2025-12-04T08:53:38.1233152Z Current runner version: '2.329.0'
2025-12-04T08:53:38.1236118Z Runner name: 'linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd'
2025-12-04T08:53:38.1236517Z Runner group name: 'default'
2025-12-04T08:53:38.1236963Z Machine name: 'linux'
2025-12-04T08:53:38.1238088Z ##[group]GITHUB_TOKEN Permissions
2025-12-04T08:53:38.1239124Z Contents: read
2025-12-04T08:53:38.1239362Z Metadata: read
2025-12-04T08:53:38.1239628Z ##[endgroup]
2025-12-04T08:53:38.1240628Z Secret source: Actions
2025-12-04T08:53:38.1240930Z Prepare workflow directory
2025-12-04T08:53:38.1479447Z Prepare all required actions
2025-12-04T08:53:38.1499022Z Getting action download info
2025-12-04T08:53:38.6720662Z Download action repository 'pytorch/pytorch@main' (SHA:ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T08:53:43.0724417Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd)
2025-12-04T08:53:44.1085574Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T08:53:46.8059017Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-12-04T08:53:48.0381926Z Getting action download info
2025-12-04T08:53:48.2636094Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5)
2025-12-04T08:53:49.1473551Z Getting action download info
2025-12-04T08:53:49.3480879Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-12-04T08:53:50.1588467Z Getting action download info
2025-12-04T08:53:50.3913000Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T08:53:50.3915180Z ##[group] Inputs
2025-12-04T08:53:50.3915333Z build-environment: linux-jammy-rocm-py3.10
2025-12-04T08:53:50.3918565Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, 
"runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T08:53:50.3921914Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:53:50.3922196Z sync-tag: 2025-12-04T08:53:50.3922719Z timeout-minutes: 300 2025-12-04T08:53:50.3922852Z tests-to-include: 2025-12-04T08:53:50.3922964Z dashboard-tag: 2025-12-04T08:53:50.3923200Z disable-monitor: true 2025-12-04T08:53:50.3923314Z monitor-log-interval: 5 2025-12-04T08:53:50.3923436Z monitor-data-collect-interval: 1 2025-12-04T08:53:50.3923568Z ##[endgroup] 2025-12-04T08:53:50.3923775Z Complete job name: linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T08:53:50.4181278Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main 2025-12-04T08:53:50.4181573Z with: 2025-12-04T08:53:50.4181666Z no-sudo: true 2025-12-04T08:53:50.4181770Z submodules: recursive 2025-12-04T08:53:50.4181927Z fetch-depth: 0 2025-12-04T08:53:50.4182070Z env: 2025-12-04T08:53:50.4182162Z GIT_DEFAULT_BRANCH: main 
2025-12-04T08:53:50.4182279Z ##[endgroup]
2025-12-04T08:53:50.4227467Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T08:53:50.4227856Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T08:53:50.4234728Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T08:53:50.4234881Z env:
2025-12-04T08:53:50.4234973Z GIT_DEFAULT_BRANCH: main
2025-12-04T08:53:50.4235076Z ##[endgroup]
2025-12-04T08:53:50.4396289Z ##[group]Run actions/checkout@v4
2025-12-04T08:53:50.4396461Z with:
2025-12-04T08:53:50.4396583Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
2025-12-04T08:53:50.4396717Z fetch-depth: 0
2025-12-04T08:53:50.4396811Z submodules: recursive
2025-12-04T08:53:50.4397027Z show-progress: false
2025-12-04T08:53:50.4397135Z repository: pytorch/pytorch
2025-12-04T08:53:50.4397310Z token: ***
2025-12-04T08:53:50.4397402Z ssh-strict: true
2025-12-04T08:53:50.4397493Z ssh-user: git
2025-12-04T08:53:50.4397591Z persist-credentials: true
2025-12-04T08:53:50.4397698Z clean: true
2025-12-04T08:53:50.4397807Z sparse-checkout-cone-mode: true
2025-12-04T08:53:50.4397925Z fetch-tags: false
2025-12-04T08:53:50.4398020Z lfs: false
2025-12-04T08:53:50.4398109Z set-safe-directory: true
2025-12-04T08:53:50.4398211Z env:
2025-12-04T08:53:50.4398295Z GIT_DEFAULT_BRANCH: main
2025-12-04T08:53:50.4398394Z ##[endgroup]
2025-12-04T08:53:50.4928276Z Syncing repository: pytorch/pytorch
2025-12-04T08:53:50.4928843Z ##[group]Getting Git version info
2025-12-04T08:53:50.4929003Z Working directory is '/home/runner/_work/pytorch/pytorch'
2025-12-04T08:53:50.4929253Z [command]/usr/bin/git version
2025-12-04T08:53:50.4929367Z git version 2.52.0
2025-12-04T08:53:50.4946241Z ##[endgroup]
2025-12-04T08:53:50.4952514Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/ac52d958-577e-4565-8772-69a4bf2fa33e/.gitconfig'
2025-12-04T08:53:50.4957297Z Temporarily overriding HOME='/home/runner/_work/_temp/ac52d958-577e-4565-8772-69a4bf2fa33e' before making global git config changes
2025-12-04T08:53:50.4957960Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T08:53:50.4960154Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T08:53:50.4992312Z [command]/usr/bin/git config --local --get remote.origin.url
2025-12-04T08:53:50.5009381Z https://github.com/pytorch/pytorch
2025-12-04T08:53:50.5019582Z ##[group]Removing previously created refs, to avoid conflicts
2025-12-04T08:53:50.5023430Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-12-04T08:53:50.5044787Z refs/heads/main
2025-12-04T08:53:50.5054924Z [command]/usr/bin/git checkout --detach
2025-12-04T08:53:52.2042019Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919)
2025-12-04T08:53:52.2095270Z [command]/usr/bin/git branch --delete --force main
2025-12-04T08:53:52.2215928Z Deleted branch main (was ffd9b0fb4355).
2025-12-04T08:53:52.2224464Z ##[endgroup]
2025-12-04T08:53:52.2228872Z [command]/usr/bin/git submodule status
2025-12-04T08:53:52.2460685Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe)
2025-12-04T08:53:52.2523765Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081)
2025-12-04T08:53:52.2571466Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327)
2025-12-04T08:53:52.2640541Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0)
2025-12-04T08:53:52.2674749Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93)
2025-12-04T08:53:52.2731453Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600)
2025-12-04T08:53:52.3037182Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656)
2025-12-04T08:53:52.3064169Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101)
2025-12-04T08:53:52.3080071Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3)
2025-12-04T08:53:52.3135438Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d)
2025-12-04T08:53:52.3212127Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0)
2025-12-04T08:53:52.3290148Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30)
2025-12-04T08:53:52.3311940Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c)
2025-12-04T08:53:52.3371935Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1)
2025-12-04T08:53:52.3392297Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39)
2025-12-04T08:53:52.3462960Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4)
2025-12-04T08:53:52.3479593Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23)
2025-12-04T08:53:52.3723738Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0)
2025-12-04T08:53:52.3809161Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17)
2025-12-04T08:53:52.3915712Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0)
2025-12-04T08:53:52.4056171Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108)
2025-12-04T08:53:52.4104904Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1)
2025-12-04T08:53:52.4153018Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5)
2025-12-04T08:53:52.4280282Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main)
2025-12-04T08:53:52.4300215Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0)
2025-12-04T08:53:52.4313054Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4)
2025-12-04T08:53:52.4327647Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0)
2025-12-04T08:53:52.4530312Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0)
2025-12-04T08:53:52.4547447Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2)
2025-12-04T08:53:52.4575493Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5)
2025-12-04T08:53:52.4793305Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b)
2025-12-04T08:53:52.4846193Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master)
2025-12-04T08:53:52.4890133Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e)
2025-12-04T08:53:52.4902846Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1)
2025-12-04T08:53:52.4959140Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated)
2025-12-04T08:53:52.5009497Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T08:53:52.5058663Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T08:53:52.5068453Z ##[group]Cleaning the repository 2025-12-04T08:53:52.5071838Z [command]/usr/bin/git clean -ffdx 2025-12-04T08:53:52.5185990Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T08:53:53.8435392Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T08:53:53.8501565Z ##[endgroup] 2025-12-04T08:53:53.8502699Z ##[group]Disabling automatic garbage collection 2025-12-04T08:53:53.8505772Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T08:53:53.8532911Z ##[endgroup] 2025-12-04T08:53:53.8533269Z ##[group]Setting up auth 2025-12-04T08:53:53.8535693Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T08:53:53.8550749Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T08:53:53.8746860Z Entering 'android/libs/fbjni' 2025-12-04T08:53:53.8771547Z Entering 'third_party/FP16' 2025-12-04T08:53:53.8797491Z Entering 'third_party/FXdiv' 2025-12-04T08:53:53.8828020Z Entering 'third_party/NNPACK' 2025-12-04T08:53:53.8852800Z Entering 'third_party/NVTX' 2025-12-04T08:53:53.8874526Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:53:53.8894472Z Entering 'third_party/XNNPACK' 2025-12-04T08:53:53.8921056Z Entering 'third_party/aiter' 2025-12-04T08:53:53.8948686Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:53:53.8980734Z Entering 'third_party/benchmark' 2025-12-04T08:53:53.9003372Z Entering 'third_party/composable_kernel' 2025-12-04T08:53:53.9028565Z Entering 'third_party/cpp-httplib' 2025-12-04T08:53:53.9050431Z Entering 'third_party/cpuinfo' 2025-12-04T08:53:53.9073101Z Entering 
'third_party/cudnn_frontend' 2025-12-04T08:53:53.9094794Z Entering 'third_party/cutlass' 2025-12-04T08:53:53.9120533Z Entering 'third_party/fbgemm' 2025-12-04T08:53:53.9149083Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:53:53.9183473Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:53:53.9215127Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:53:53.9236420Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:53:53.9264360Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:53:53.9286279Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:53:53.9320316Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:53:53.9360887Z Entering 'third_party/flash-attention' 2025-12-04T08:53:53.9397678Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:53:53.9424030Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:53:53.9458122Z Entering 'third_party/flatbuffers' 2025-12-04T08:53:53.9486512Z Entering 'third_party/fmt' 2025-12-04T08:53:53.9513369Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:53:53.9544398Z Entering 'third_party/gloo' 2025-12-04T08:53:53.9571643Z Entering 'third_party/googletest' 2025-12-04T08:53:53.9596999Z Entering 'third_party/ideep' 2025-12-04T08:53:53.9629071Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:53:53.9660681Z Entering 'third_party/ittapi' 2025-12-04T08:53:53.9684039Z Entering 'third_party/kineto' 2025-12-04T08:53:53.9708808Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:53:53.9741322Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:53:53.9763789Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:53:53.9787453Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:53:53.9818096Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 
2025-12-04T08:53:53.9844021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:53:53.9870867Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:53:53.9902885Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:53:53.9928916Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:53:53.9958216Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:53:53.9987301Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:53:54.0017642Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.0047508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:54.0077139Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:53:54.0108268Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:53:54.0132611Z Entering 'third_party/kleidiai' 2025-12-04T08:53:54.0156166Z Entering 'third_party/mimalloc' 2025-12-04T08:53:54.0181200Z Entering 'third_party/nlohmann' 2025-12-04T08:53:54.0204083Z Entering 'third_party/onnx' 2025-12-04T08:53:54.0235141Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:53:54.0263355Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:53:54.0289221Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:53:54.0312919Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:53:54.0334677Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:53:54.0354475Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:53:54.0377978Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:53:54.0410616Z 
Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:53:54.0434970Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:53:54.0463968Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.0489660Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:54.0514057Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:53:54.0542715Z Entering 'third_party/pocketfft' 2025-12-04T08:53:54.0567149Z Entering 'third_party/protobuf' 2025-12-04T08:53:54.0595153Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:53:54.0626529Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:53:54.0651769Z Entering 'third_party/psimd' 2025-12-04T08:53:54.0673278Z Entering 'third_party/pthreadpool' 2025-12-04T08:53:54.0692825Z Entering 'third_party/pybind11' 2025-12-04T08:53:54.0716636Z Entering 'third_party/python-peachpy' 2025-12-04T08:53:54.0740185Z Entering 'third_party/sleef' 2025-12-04T08:53:54.0772970Z Entering 'third_party/tensorpipe' 2025-12-04T08:53:54.0801105Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:53:54.0823854Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:53:54.0851766Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:53:54.0874900Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:53:54.0899611Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:53:54.0945826Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T08:53:54.0968929Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T08:53:54.1134720Z Entering 
'android/libs/fbjni' 2025-12-04T08:53:54.1155410Z Entering 'third_party/FP16' 2025-12-04T08:53:54.1180801Z Entering 'third_party/FXdiv' 2025-12-04T08:53:54.1204482Z Entering 'third_party/NNPACK' 2025-12-04T08:53:54.1226313Z Entering 'third_party/NVTX' 2025-12-04T08:53:54.1247655Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:53:54.1270100Z Entering 'third_party/XNNPACK' 2025-12-04T08:53:54.1297792Z Entering 'third_party/aiter' 2025-12-04T08:53:54.1321911Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:53:54.1349358Z Entering 'third_party/benchmark' 2025-12-04T08:53:54.1374794Z Entering 'third_party/composable_kernel' 2025-12-04T08:53:54.1400883Z Entering 'third_party/cpp-httplib' 2025-12-04T08:53:54.1425521Z Entering 'third_party/cpuinfo' 2025-12-04T08:53:54.1457887Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:53:54.1478943Z Entering 'third_party/cutlass' 2025-12-04T08:53:54.1506371Z Entering 'third_party/fbgemm' 2025-12-04T08:53:54.1527496Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:53:54.1550799Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:53:54.1580281Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:53:54.1602372Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:53:54.1624850Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:53:54.1644035Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:53:54.1668057Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:53:54.1696159Z Entering 'third_party/flash-attention' 2025-12-04T08:53:54.1718425Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:53:54.1749717Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:53:54.1783083Z Entering 'third_party/flatbuffers' 2025-12-04T08:53:54.1805457Z Entering 'third_party/fmt' 2025-12-04T08:53:54.1829125Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:53:54.1851911Z Entering 
'third_party/gloo' 2025-12-04T08:53:54.1880353Z Entering 'third_party/googletest' 2025-12-04T08:53:54.1902472Z Entering 'third_party/ideep' 2025-12-04T08:53:54.1923377Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:53:54.1948783Z Entering 'third_party/ittapi' 2025-12-04T08:53:54.1970643Z Entering 'third_party/kineto' 2025-12-04T08:53:54.1993015Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:53:54.2027694Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:53:54.2059209Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:53:54.2092100Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:53:54.2119893Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:53:54.2146264Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:53:54.2174485Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:53:54.2206743Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:53:54.2235201Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:53:54.2257162Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:53:54.2286372Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:53:54.2310736Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.2338992Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:54.2364554Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:53:54.2385969Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:53:54.2409853Z Entering 
'third_party/kleidiai' 2025-12-04T08:53:54.2435758Z Entering 'third_party/mimalloc' 2025-12-04T08:53:54.2458172Z Entering 'third_party/nlohmann' 2025-12-04T08:53:54.2488057Z Entering 'third_party/onnx' 2025-12-04T08:53:54.2519966Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:53:54.2545286Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:53:54.2567513Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:53:54.2589413Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:53:54.2612645Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:53:54.2644980Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:53:54.2670307Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:53:54.2692347Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:53:54.2714922Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:53:54.2735661Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.2758828Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:54.2783271Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:53:54.2814366Z Entering 'third_party/pocketfft' 2025-12-04T08:53:54.2837024Z Entering 'third_party/protobuf' 2025-12-04T08:53:54.2860395Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:53:54.2886357Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:53:54.2913291Z Entering 'third_party/psimd' 2025-12-04T08:53:54.2936210Z Entering 'third_party/pthreadpool' 2025-12-04T08:53:54.2957450Z Entering 'third_party/pybind11' 2025-12-04T08:53:54.2993081Z Entering 'third_party/python-peachpy' 2025-12-04T08:53:54.3022091Z Entering 'third_party/sleef' 2025-12-04T08:53:54.3047900Z Entering 'third_party/tensorpipe' 
2025-12-04T08:53:54.3077636Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:53:54.3111046Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:53:54.3142685Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:53:54.3172238Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:53:54.3200260Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:53:54.3257024Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.3284014Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T08:53:54.3483058Z Entering 'android/libs/fbjni' 2025-12-04T08:53:54.3497347Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T08:53:54.3507683Z Entering 'third_party/FP16' 2025-12-04T08:53:54.3519359Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T08:53:54.3526758Z Entering 'third_party/FXdiv' 2025-12-04T08:53:54.3536793Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T08:53:54.3544902Z Entering 'third_party/NNPACK' 2025-12-04T08:53:54.3558405Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T08:53:54.3567699Z Entering 'third_party/NVTX' 2025-12-04T08:53:54.3579955Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T08:53:54.3591328Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:53:54.3604558Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T08:53:54.3614584Z Entering 'third_party/XNNPACK' 2025-12-04T08:53:54.3627142Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T08:53:54.3643472Z Entering 'third_party/aiter' 2025-12-04T08:53:54.3657372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T08:53:54.3667500Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:53:54.3681173Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T08:53:54.3697077Z Entering 'third_party/benchmark' 2025-12-04T08:53:54.3711302Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:53:54.3720228Z Entering 'third_party/composable_kernel' 2025-12-04T08:53:54.3740744Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T08:53:54.3759162Z Entering 'third_party/cpp-httplib' 2025-12-04T08:53:54.3773015Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T08:53:54.3779721Z Entering 'third_party/cpuinfo' 2025-12-04T08:53:54.3793870Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T08:53:54.3801309Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:53:54.3810445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T08:53:54.3825477Z Entering 'third_party/cutlass' 2025-12-04T08:53:54.3842198Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T08:53:54.3855991Z Entering 'third_party/fbgemm' 2025-12-04T08:53:54.3869759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T08:53:54.3879751Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:53:54.3897590Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T08:53:54.3909381Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:53:54.3928783Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T08:53:54.3942899Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:53:54.3958838Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T08:53:54.3969734Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:53:54.3981095Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T08:53:54.3994324Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:53:54.4005450Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T08:53:54.4014697Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:53:54.4024217Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T08:53:54.4032964Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:53:54.4042445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T08:53:54.4057082Z Entering 'third_party/flash-attention' 2025-12-04T08:53:54.4071984Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T08:53:54.4081484Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:53:54.4091448Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 
2025-12-04T08:53:54.4103309Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:53:54.4119793Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T08:53:54.4134624Z Entering 'third_party/flatbuffers' 2025-12-04T08:53:54.4145146Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T08:53:54.4155531Z Entering 'third_party/fmt' 2025-12-04T08:53:54.4165158Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:53:54.4173742Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:53:54.4183357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T08:53:54.4192518Z Entering 'third_party/gloo' 2025-12-04T08:53:54.4203655Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T08:53:54.4215093Z Entering 'third_party/googletest' 2025-12-04T08:53:54.4227591Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.4240870Z Entering 'third_party/ideep' 2025-12-04T08:53:54.4258915Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T08:53:54.4269046Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:53:54.4279635Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T08:53:54.4296736Z Entering 'third_party/ittapi' 2025-12-04T08:53:54.4306698Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T08:53:54.4316253Z Entering 'third_party/kineto' 2025-12-04T08:53:54.4325944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T08:53:54.4335318Z Entering 
'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:53:54.4346475Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T08:53:54.4355881Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:53:54.4368508Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T08:53:54.4383043Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:53:54.4397648Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T08:53:54.4408034Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:53:54.4420011Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:53:54.4430262Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:53:54.4449825Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T08:53:54.4459508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:53:54.4475489Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T08:53:54.4486941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:53:54.4497308Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config 
remote.origin.url 2025-12-04T08:53:54.4507078Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:53:54.4517871Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.4532468Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:53:54.4545154Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T08:53:54.4556752Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:53:54.4570478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T08:53:54.4580392Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:53:54.4589937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:53:54.4598730Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.4608564Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:53:54.4619111Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:54.4634877Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:53:54.4649543Z Entering 
'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:53:54.4660981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T08:53:54.4669062Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:53:54.4682722Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.4693072Z Entering 'third_party/kleidiai' 2025-12-04T08:53:54.4709346Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T08:53:54.4719110Z Entering 'third_party/mimalloc' 2025-12-04T08:53:54.4735136Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T08:53:54.4744698Z Entering 'third_party/nlohmann' 2025-12-04T08:53:54.4755075Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T08:53:54.4768164Z Entering 'third_party/onnx' 2025-12-04T08:53:54.4780408Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T08:53:54.4802252Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:53:54.4818222Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:53:54.4831547Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:53:54.4847119Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T08:53:54.4858110Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:53:54.4872025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:53:54.4882317Z Entering 
'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:53:54.4895486Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.4906820Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:53:54.4930162Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T08:53:54.4939024Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:53:54.4953260Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T08:53:54.4963229Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:53:54.4973445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T08:53:54.4982940Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:53:54.4993566Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T08:53:54.5002439Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:53:54.5017592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:53:54.5030268Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:54.5043856Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:53:54.5057548Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 
2025-12-04T08:53:54.5070548Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:53:54.5082089Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:53:54.5097220Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T08:53:54.5116854Z Entering 'third_party/pocketfft' 2025-12-04T08:53:54.5126999Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T08:53:54.5135510Z Entering 'third_party/protobuf' 2025-12-04T08:53:54.5145270Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T08:53:54.5161896Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:53:54.5178270Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:53:54.5188450Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:53:54.5200366Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.5215076Z Entering 'third_party/psimd' 2025-12-04T08:53:54.5227177Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T08:53:54.5237278Z Entering 'third_party/pthreadpool' 2025-12-04T08:53:54.5247906Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T08:53:54.5258588Z Entering 'third_party/pybind11' 2025-12-04T08:53:54.5274838Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:53:54.5286089Z Entering 'third_party/python-peachpy' 2025-12-04T08:53:54.5298434Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T08:53:54.5315369Z Entering 'third_party/sleef' 2025-12-04T08:53:54.5329237Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T08:53:54.5343428Z Entering 'third_party/tensorpipe' 2025-12-04T08:53:54.5354763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T08:53:54.5365468Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:53:54.5375261Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:53:54.5387992Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:53:54.5398373Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T08:53:54.5411020Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:53:54.5420662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T08:53:54.5431816Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:53:54.5442037Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:53:54.5449736Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:53:54.5459839Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T08:53:54.5484254Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5503427Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5520944Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5541918Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5560472Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5578409Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5595716Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5611632Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5629924Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5645818Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5664035Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5680425Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5701463Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5721541Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5754425Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5754995Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5769150Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5785824Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5799688Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5816781Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5831933Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5846639Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5861697Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5884638Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5908405Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5926461Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5946838Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5968402Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.5984636Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6001519Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6022226Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6037626Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6056049Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6073391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6093679Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6117405Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6137526Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6153987Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6173726Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6190970Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6206778Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6223434Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6245927Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6262508Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6280571Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6297956Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6317512Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6334305Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T08:53:54.6351786Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6373545Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6390931Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6409192Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6426934Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6444342Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6461822Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6478659Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6499834Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6519136Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6537130Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6552730Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6571615Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6589303Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6608316Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6627812Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6647871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6667680Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6687811Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6705392Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6723261Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6746469Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6767398Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6786209Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6803682Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6821806Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6839119Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6856425Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6874228Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6892089Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6908886Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6927028Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6949968Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:53:54.6971723Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T08:53:54.7002225Z ##[endgroup] 2025-12-04T08:53:54.7002434Z ##[group]Fetching the repository 2025-12-04T08:53:54.7006370Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T08:53:55.3605417Z From https://github.com/pytorch/pytorch 2025-12-04T08:53:55.3605848Z - [deleted] (none) -> ciflow/inductor/160174 2025-12-04T08:53:55.3606243Z - [deleted] (none) -> ciflow/trunk/160174 2025-12-04T08:53:59.2976574Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-12-04T08:53:59.2977000Z * [new branch] 2.9.1 -> origin/2.9.1 2025-12-04T08:53:59.2977474Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-12-04T08:53:59.2977930Z * [new branch] 
Flamefire-patch-1 -> origin/Flamefire-patch-1 2025-12-04T08:53:59.2978354Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-12-04T08:53:59.2978777Z * [new branch] HOPrintFunc -> origin/HOPrintFunc 2025-12-04T08:53:59.2979134Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-12-04T08:53:59.2979490Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-12-04T08:53:59.2979945Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-12-04T08:53:59.2980351Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-12-04T08:53:59.2980750Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-12-04T08:53:59.2982129Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-12-04T08:53:59.2982490Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-12-04T08:53:59.2982868Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-12-04T08:53:59.2983234Z * [new branch] VLA_exp -> origin/VLA_exp 2025-12-04T08:53:59.2983570Z * [new branch] activation_bench -> origin/activation_bench 2025-12-04T08:53:59.2983918Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T08:53:59.2984262Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T08:53:59.2984586Z * [new branch] adi/test -> origin/adi/test 2025-12-04T08:53:59.2984843Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T08:53:59.2985125Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T08:53:59.2985393Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T08:53:59.2985664Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T08:53:59.2985964Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T08:53:59.2986392Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T08:53:59.2986686Z * [new branch] adi/testpresve_change -> 
origin/adi/testpresve_change 2025-12-04T08:53:59.2986993Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T08:53:59.2987303Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T08:53:59.2987639Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T08:53:59.2987941Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T08:53:59.2988247Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T08:53:59.2988583Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T08:53:59.2988925Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T08:53:59.2989287Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T08:53:59.2989650Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T08:53:59.2989959Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T08:53:59.2990251Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T08:53:59.2990530Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T08:53:59.2990832Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T08:53:59.2991119Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T08:53:59.2991402Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T08:53:59.2991708Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T08:53:59.2992060Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T08:53:59.2992345Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T08:53:59.2992629Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T08:53:59.2992922Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T08:53:59.2993217Z * [new branch] 
annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T08:53:59.2993661Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T08:53:59.2993956Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T08:53:59.2994235Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T08:53:59.2994554Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T08:53:59.2994787Z * [new branch] aoti_const_device -> origin/aoti_const_device 2025-12-04T08:53:59.2995009Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-12-04T08:53:59.2995257Z * [new branch] aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-12-04T08:53:59.2995501Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T08:53:59.2995761Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T08:53:59.2996018Z * [new branch] async_tp -> origin/async_tp 2025-12-04T08:53:59.2996267Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T08:53:59.2996568Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T08:53:59.2996873Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T08:53:59.2997086Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T08:53:59.2997291Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T08:53:59.2997500Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T08:53:59.2997706Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T08:53:59.2997920Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T08:53:59.2998138Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T08:53:59.2998362Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T08:53:59.2998586Z * [new branch] 
atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T08:53:59.2998822Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T08:53:59.2999075Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T08:53:59.2999337Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T08:53:59.2999575Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T08:53:59.2999797Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T08:53:59.3000011Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T08:53:59.3000213Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T08:53:59.3000459Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T08:53:59.3000721Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T08:53:59.3000943Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T08:53:59.3001175Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T08:53:59.3001400Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T08:53:59.3001630Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T08:53:59.3001887Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T08:53:59.3002210Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T08:53:59.3002417Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T08:53:59.3002700Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T08:53:59.3002932Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T08:53:59.3003176Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T08:53:59.3003385Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T08:53:59.3003659Z * [new branch] 
bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T08:53:59.3004091Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T08:53:59.3004447Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T08:53:59.3004646Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T08:53:59.3004840Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T08:53:59.3005012Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T08:53:59.3005226Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T08:53:59.3005445Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T08:53:59.3005675Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T08:53:59.3005884Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T08:53:59.3006093Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T08:53:59.3006309Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T08:53:59.3006501Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T08:53:59.3006700Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T08:53:59.3006921Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T08:53:59.3007136Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T08:53:59.3007352Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T08:53:59.3007559Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T08:53:59.3007763Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 
2025-12-04T08:53:59.3007972Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T08:53:59.3008191Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T08:53:59.3008401Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T08:53:59.3008609Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T08:53:59.3008818Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T08:53:59.3009022Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T08:53:59.3009220Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T08:53:59.3009431Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T08:53:59.3009667Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T08:53:59.3009873Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T08:53:59.3010078Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T08:53:59.3010280Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T08:53:59.3010488Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T08:53:59.3010738Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T08:53:59.3010957Z * [new branch] bwd-backup -> origin/bwd-backup 2025-12-04T08:53:59.3011119Z * [new branch] c57382a49 -> origin/c57382a49 2025-12-04T08:53:59.3011279Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-12-04T08:53:59.3011449Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T08:53:59.3011645Z * [new branch] camyllh/test_setup_hooks_push -> 
origin/camyllh/test_setup_hooks_push 2025-12-04T08:53:59.3011886Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T08:53:59.3012185Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3012460Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3012735Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3013009Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3013287Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3013560Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3013832Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3014106Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3014382Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3014654Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3014935Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3015213Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3015483Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3015757Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3016032Z 
* [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3016310Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3016581Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3016855Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3017171Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3017401Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T08:53:59.3017588Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T08:53:59.3017768Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T08:53:59.3017953Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T08:53:59.3018126Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T08:53:59.3018298Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T08:53:59.3018461Z * [new branch] ci_attn -> origin/ci_attn 2025-12-04T08:53:59.3018626Z * [new branch] codex-testing -> origin/codex-testing 2025-12-04T08:53:59.3018893Z * [new branch] codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T08:53:59.3019185Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T08:53:59.3019520Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T08:53:59.3019887Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T08:53:59.3020150Z * [new branch] compatiblpy39util -> 
origin/compatiblpy39util 2025-12-04T08:53:59.3020325Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T08:53:59.3020494Z * [new branch] context_test -> origin/context_test 2025-12-04T08:53:59.3020726Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T08:53:59.3020973Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T08:53:59.3021189Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T08:53:59.3021400Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T08:53:59.3021599Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T08:53:59.3021801Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T08:53:59.3022021Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T08:53:59.3022213Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T08:53:59.3022402Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T08:53:59.3022568Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T08:53:59.3022740Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T08:53:59.3022910Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T08:53:59.3023088Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T08:53:59.3023280Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T08:53:59.3023487Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T08:53:59.3023675Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T08:53:59.3023857Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T08:53:59.3024046Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T08:53:59.3024290Z * [new branch] 
csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T08:53:59.3024526Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T08:53:59.3024754Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T08:53:59.3024956Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T08:53:59.3025138Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T08:53:59.3025313Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T08:53:59.3025508Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 2025-12-04T08:53:59.3025699Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 2025-12-04T08:53:59.3025909Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-12-04T08:53:59.3026159Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T08:53:59.3026410Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T08:53:59.3026677Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T08:53:59.3026861Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T08:53:59.3027033Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T08:53:59.3027205Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T08:53:59.3027372Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T08:53:59.3027557Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T08:53:59.3027762Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T08:53:59.3027947Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T08:53:59.3028112Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T08:53:59.3028287Z * [new branch] delete-quant-docs -> 
origin/delete-quant-docs 2025-12-04T08:53:59.3028623Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T08:53:59.3029074Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T08:53:59.3029411Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T08:53:59.3029649Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T08:53:59.3029878Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T08:53:59.3030078Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T08:53:59.3030267Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T08:53:59.3030449Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T08:53:59.3030630Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T08:53:59.3030828Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T08:53:59.3031044Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T08:53:59.3031263Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T08:53:59.3031490Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T08:53:59.3031679Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T08:53:59.3031886Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T08:53:59.3032071Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T08:53:59.3032266Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T08:53:59.3032459Z * [new branch] dev/joona/upsize3d -> 
origin/dev/joona/upsize3d 2025-12-04T08:53:59.3032630Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T08:53:59.3032804Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T08:53:59.3032978Z * [new branch] docs -> origin/docs 2025-12-04T08:53:59.3033136Z * [new branch] documentation -> origin/documentation 2025-12-04T08:53:59.3033314Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-12-04T08:53:59.3033521Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T08:53:59.3033778Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T08:53:59.3033998Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T08:53:59.3034185Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T08:53:59.3034350Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T08:53:59.3034518Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T08:53:59.3034678Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T08:53:59.3034849Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T08:53:59.3035008Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T08:53:59.3035182Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T08:53:59.3035417Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T08:53:59.3035677Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T08:53:59.3035928Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T08:53:59.3036221Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T08:53:59.3036508Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T08:53:59.3036814Z 
* [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T08:53:59.3037082Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T08:53:59.3037310Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T08:53:59.3037556Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T08:53:59.3037774Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T08:53:59.3038035Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T08:53:59.3038297Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T08:53:59.3038564Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T08:53:59.3038835Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T08:53:59.3039112Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T08:53:59.3039376Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T08:53:59.3039651Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T08:53:59.3039918Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T08:53:59.3040207Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T08:53:59.3040438Z * [new branch] exec -> origin/exec 2025-12-04T08:53:59.3040613Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-12-04T08:53:59.3040799Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-12-04T08:53:59.3040975Z * [new branch] export-D71412006 -> 
origin/export-D71412006 2025-12-04T08:53:59.3041167Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T08:53:59.3041412Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T08:53:59.3041591Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T08:53:59.3041760Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T08:53:59.3041970Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T08:53:59.3042148Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T08:53:59.3042324Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T08:53:59.3042496Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T08:53:59.3042668Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T08:53:59.3042835Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T08:53:59.3043012Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T08:53:59.3043180Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T08:53:59.3043353Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T08:53:59.3043524Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T08:53:59.3043689Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T08:53:59.3043865Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T08:53:59.3044036Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T08:53:59.3044204Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T08:53:59.3044376Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T08:53:59.3044548Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T08:53:59.3044714Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T08:53:59.3044887Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T08:53:59.3045059Z * [new 
branch] export-D86460608 -> origin/export-D86460608 2025-12-04T08:53:59.3045225Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T08:53:59.3045399Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T08:53:59.3045624Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T08:53:59.3045792Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T08:53:59.3045962Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T08:53:59.3046187Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T08:53:59.3046415Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T08:53:59.3046615Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T08:53:59.3046797Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T08:53:59.3047001Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T08:53:59.3047202Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T08:53:59.3047387Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T08:53:59.3047582Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T08:53:59.3047759Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T08:53:59.3047951Z * [new branch] fca -> origin/fca 2025-12-04T08:53:59.3048114Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-12-04T08:53:59.3048277Z * [new branch] fca5 -> origin/fca5 2025-12-04T08:53:59.3048458Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T08:53:59.3048660Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T08:53:59.3048853Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T08:53:59.3049034Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T08:53:59.3049217Z * [new branch] findhao/base_commit -> 
origin/findhao/base_commit 2025-12-04T08:53:59.3049404Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T08:53:59.3049597Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T08:53:59.3049786Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T08:53:59.3049975Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T08:53:59.3050170Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T08:53:59.3050373Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T08:53:59.3050563Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T08:53:59.3050780Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T08:53:59.3050991Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T08:53:59.3051168Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T08:53:59.3051349Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T08:53:59.3051546Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T08:53:59.3051746Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T08:53:59.3051968Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T08:53:59.3052158Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T08:53:59.3052330Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T08:53:59.3052533Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T08:53:59.3052707Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T08:53:59.3052876Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T08:53:59.3053057Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T08:53:59.3053237Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T08:53:59.3053434Z 
* [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T08:53:59.3053635Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T08:53:59.3053841Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T08:53:59.3054082Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T08:53:59.3054305Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T08:53:59.3054482Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T08:53:59.3054653Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T08:53:59.3054853Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T08:53:59.3055019Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T08:53:59.3055254Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 2025-12-04T08:53:59.3055513Z * [new branch] georgehong/cmakelists-patch -> origin/georgehong/cmakelists-patch 2025-12-04T08:53:59.3055720Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T08:53:59.3055906Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T08:53:59.3056095Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T08:53:59.3056283Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T08:53:59.3056472Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T08:53:59.3056661Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T08:53:59.3056841Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T08:53:59.3057026Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T08:53:59.3057205Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T08:53:59.3057381Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T08:53:59.3057563Z 
* [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T08:53:59.3057740Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T08:53:59.3057916Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T08:53:59.3058095Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T08:53:59.3058279Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T08:53:59.3058458Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T08:53:59.3058638Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T08:53:59.3058810Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T08:53:59.3058986Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T08:53:59.3059181Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T08:53:59.3059354Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T08:53:59.3059529Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T08:53:59.3059704Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T08:53:59.3059883Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T08:53:59.3060076Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T08:53:59.3060277Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T08:53:59.3060474Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T08:53:59.3060672Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T08:53:59.3060875Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T08:53:59.3061068Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T08:53:59.3061264Z * [new branch] gh/IvanKobzarev/159/base -> 
origin/gh/IvanKobzarev/159/base 2025-12-04T08:53:59.3061484Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T08:53:59.3061682Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T08:53:59.3061915Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T08:53:59.3062110Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T08:53:59.3062300Z * [new branch] gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 2025-12-04T08:53:59.3062499Z * [new branch] gh/IvanKobzarev/163/base -> origin/gh/IvanKobzarev/163/base 2025-12-04T08:53:59.3062697Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T08:53:59.3062890Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T08:53:59.3063084Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T08:53:59.3063283Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T08:53:59.3063478Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T08:53:59.3063674Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T08:53:59.3063872Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T08:53:59.3064064Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T08:53:59.3064270Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T08:53:59.3064467Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T08:53:59.3064659Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T08:53:59.3064854Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T08:53:59.3065048Z * [new branch] gh/IvanKobzarev/169/head -> 
origin/gh/IvanKobzarev/169/head 2025-12-04T08:53:59.3065243Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T08:53:59.3065438Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T08:53:59.3065629Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T08:53:59.3065827Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T08:53:59.3066066Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T08:53:59.3066257Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T08:53:59.3066453Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T08:53:59.3066649Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T08:53:59.3066847Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T08:53:59.3067042Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T08:53:59.3067235Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T08:53:59.3067426Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T08:53:59.3067636Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T08:53:59.3067831Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T08:53:59.3068022Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T08:53:59.3068243Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T08:53:59.3068438Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T08:53:59.3068629Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T08:53:59.3068827Z * [new branch] gh/IvanKobzarev/175/orig -> 
origin/gh/IvanKobzarev/175/orig 2025-12-04T08:53:59.3069020Z * [new branch] gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 2025-12-04T08:53:59.3069213Z * [new branch] gh/IvanKobzarev/176/head -> origin/gh/IvanKobzarev/176/head 2025-12-04T08:53:59.3069410Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T08:53:59.3069606Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T08:53:59.3069801Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T08:53:59.3070001Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T08:53:59.3070198Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T08:53:59.3070391Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T08:53:59.3070586Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T08:53:59.3070784Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T08:53:59.3070978Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T08:53:59.3071174Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T08:53:59.3071366Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T08:53:59.3071565Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T08:53:59.3071761Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T08:53:59.3071995Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T08:53:59.3072190Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T08:53:59.3072390Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T08:53:59.3072583Z * [new branch] gh/IvanKobzarev/182/base -> 
origin/gh/IvanKobzarev/182/base 2025-12-04T08:53:59.3072813Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T08:53:59.3073009Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T08:53:59.3073203Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T08:53:59.3073400Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T08:53:59.3073598Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T08:53:59.3073794Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T08:53:59.3073992Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T08:53:59.3074187Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T08:53:59.3074388Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T08:53:59.3074585Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T08:53:59.3074778Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T08:53:59.3074966Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T08:53:59.3075426Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T08:53:59.3075623Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T08:53:59.3075809Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T08:53:59.3076001Z * [new branch] gh/NikhilAPatel/5/head -> origin/gh/NikhilAPatel/5/head 2025-12-04T08:53:59.3076191Z * [new branch] gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 2025-12-04T08:53:59.3076377Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T08:53:59.3076553Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T08:53:59.3076725Z * [new branch] 
gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T08:53:59.3076892Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T08:53:59.3077067Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T08:53:59.3077242Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T08:53:59.3077408Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T08:53:59.3077578Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T08:53:59.3077744Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T08:53:59.3077913Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T08:53:59.3078081Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T08:53:59.3078248Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T08:53:59.3078417Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T08:53:59.3078588Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T08:53:59.3078755Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T08:53:59.3078925Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T08:53:59.3079096Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T08:53:59.3079262Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T08:53:59.3079431Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T08:53:59.3079627Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T08:53:59.3079793Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T08:53:59.3079965Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T08:53:59.3080134Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T08:53:59.3080300Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T08:53:59.3080468Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 
2025-12-04T08:53:59.3080638Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T08:53:59.3080806Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T08:53:59.3080976Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T08:53:59.3081146Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T08:53:59.3081317Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T08:53:59.3081488Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T08:53:59.3081680Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T08:53:59.3081899Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T08:53:59.3082069Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T08:53:59.3082235Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T08:53:59.3082403Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T08:53:59.3082576Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T08:53:59.3082759Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T08:53:59.3082949Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-12-04T08:53:59.3083140Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-12-04T08:53:59.3083332Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T08:53:59.3083528Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T08:53:59.3083719Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T08:53:59.3083908Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T08:53:59.3084099Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T08:53:59.3084292Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T08:53:59.3084482Z * [new branch] 
gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T08:53:59.3084673Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T08:53:59.3084867Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T08:53:59.3085058Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T08:53:59.3085252Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T08:53:59.3085441Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T08:53:59.3085634Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T08:53:59.3085825Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T08:53:59.3086016Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T08:53:59.3086241Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T08:53:59.3086433Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T08:53:59.3086621Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T08:53:59.3086824Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T08:53:59.3087020Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T08:53:59.3087207Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T08:53:59.3087401Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T08:53:59.3087594Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T08:53:59.3087787Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T08:53:59.3087979Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T08:53:59.3088171Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T08:53:59.3088359Z * [new branch] 
gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T08:53:59.3088595Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T08:53:59.3088788Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T08:53:59.3088975Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T08:53:59.3089168Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T08:53:59.3089362Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T08:53:59.3089557Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T08:53:59.3089757Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 2025-12-04T08:53:59.3089956Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 2025-12-04T08:53:59.3090160Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T08:53:59.3090361Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T08:53:59.3090557Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T08:53:59.3090757Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T08:53:59.3090961Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T08:53:59.3091159Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T08:53:59.3091360Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T08:53:59.3091559Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T08:53:59.3091755Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T08:53:59.3091987Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T08:53:59.3092189Z * [new branch] gh/SherlockNoMad/15/head -> 
origin/gh/SherlockNoMad/15/head 2025-12-04T08:53:59.3092383Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T08:53:59.3092578Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T08:53:59.3092774Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T08:53:59.3093005Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T08:53:59.3093206Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T08:53:59.3093404Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T08:53:59.3093602Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T08:53:59.3093797Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T08:53:59.3093993Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T08:53:59.3094185Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T08:53:59.3094381Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T08:53:59.3094576Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T08:53:59.3094770Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T08:53:59.3094967Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T08:53:59.3095162Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T08:53:59.3095388Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T08:53:59.3095583Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T08:53:59.3095777Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T08:53:59.3095968Z * [new branch] gh/SherlockNoMad/3/base -> 
origin/gh/SherlockNoMad/3/base 2025-12-04T08:53:59.3096167Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T08:53:59.3096362Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T08:53:59.3096549Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T08:53:59.3096741Z * [new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-12-04T08:53:59.3096932Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T08:53:59.3097138Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T08:53:59.3097351Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T08:53:59.3097558Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T08:53:59.3097765Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T08:53:59.3097968Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T08:53:59.3098155Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T08:53:59.3098342Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T08:53:59.3098536Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T08:53:59.3118306Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T08:53:59.3118548Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T08:53:59.3118744Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T08:53:59.3118959Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T08:53:59.3119163Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T08:53:59.3119379Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T08:53:59.3119677Z * [new branch] 
gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T08:53:59.3119876Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T08:53:59.3120082Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T08:53:59.3120269Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T08:53:59.3120453Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T08:53:59.3120634Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T08:53:59.3120818Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T08:53:59.3121004Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T08:53:59.3121190Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T08:53:59.3121374Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T08:53:59.3121597Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T08:53:59.3121790Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T08:53:59.3122059Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T08:53:59.3122254Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T08:53:59.3122437Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T08:53:59.3122617Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T08:53:59.3122798Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T08:53:59.3122980Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T08:53:59.3123166Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T08:53:59.3123347Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T08:53:59.3123528Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 2025-12-04T08:53:59.3123715Z * [new branch] gh/XilunWu/175/head -> 
origin/gh/XilunWu/175/head 2025-12-04T08:53:59.3123895Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T08:53:59.3124076Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T08:53:59.3124255Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T08:53:59.3124435Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T08:53:59.3124619Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T08:53:59.3124813Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T08:53:59.3125000Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T08:53:59.3125186Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T08:53:59.3125378Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T08:53:59.3125570Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T08:53:59.3125755Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T08:53:59.3125948Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T08:53:59.3126127Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T08:53:59.3126363Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T08:53:59.3126545Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T08:53:59.3126726Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T08:53:59.3126912Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T08:53:59.3127096Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T08:53:59.3127276Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T08:53:59.3127462Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T08:53:59.3127648Z * 
[new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T08:53:59.3127828Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T08:53:59.3128015Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T08:53:59.3128206Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T08:53:59.3128395Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T08:53:59.3128581Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T08:53:59.3128791Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T08:53:59.3128972Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T08:53:59.3129153Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T08:53:59.3129332Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T08:53:59.3129512Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T08:53:59.3129695Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T08:53:59.3129874Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T08:53:59.3130055Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T08:53:59.3130240Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T08:53:59.3130422Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-12-04T08:53:59.3130604Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T08:53:59.3130785Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T08:53:59.3130965Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T08:53:59.3131148Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T08:53:59.3131330Z * [new branch] gh/XuehaiPan/366/base -> 
origin/gh/XuehaiPan/366/base 2025-12-04T08:53:59.3131510Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T08:53:59.3131693Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T08:53:59.3131907Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T08:53:59.3132090Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T08:53:59.3132279Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T08:53:59.3132460Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T08:53:59.3132648Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T08:53:59.3132832Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T08:53:59.3133061Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T08:53:59.3133250Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T08:53:59.3133435Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T08:53:59.3133619Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T08:53:59.3133804Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T08:53:59.3133988Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T08:53:59.3134172Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T08:53:59.3134356Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T08:53:59.3134541Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T08:53:59.3134728Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T08:53:59.3134912Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T08:53:59.3135099Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 
2025-12-04T08:53:59.3135315Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T08:53:59.3135499Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T08:53:59.3135683Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T08:53:59.3135865Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T08:53:59.3136049Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T08:53:59.3136238Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T08:53:59.3136421Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T08:53:59.3136604Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T08:53:59.3136796Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T08:53:59.3136990Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T08:53:59.3137177Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-12-04T08:53:59.3137362Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T08:53:59.3137550Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T08:53:59.3137736Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T08:53:59.3137921Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T08:53:59.3138108Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T08:53:59.3138295Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T08:53:59.3138482Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T08:53:59.3138669Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T08:53:59.3138854Z * [new branch] gh/ZhiweiYan-96/66/base -> 
origin/gh/ZhiweiYan-96/66/base 2025-12-04T08:53:59.3139040Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T08:53:59.3139227Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T08:53:59.3139413Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T08:53:59.3139622Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T08:53:59.3139809Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T08:53:59.3139998Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T08:53:59.3140183Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T08:53:59.3140368Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T08:53:59.3140549Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T08:53:59.3140727Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T08:53:59.3140911Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T08:53:59.3141096Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T08:53:59.3141280Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T08:53:59.3141460Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T08:53:59.3141632Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T08:53:59.3141842Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T08:53:59.3142138Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T08:53:59.3142403Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T08:53:59.3142605Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T08:53:59.3142803Z * [new branch] gh/alexsamardzic/12/orig -> 
origin/gh/alexsamardzic/12/orig 2025-12-04T08:53:59.3143002Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T08:53:59.3143199Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T08:53:59.3143396Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T08:53:59.3143592Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T08:53:59.3143789Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T08:53:59.3143984Z * [new branch] gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 2025-12-04T08:53:59.3144173Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-12-04T08:53:59.3144354Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T08:53:59.3144534Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T08:53:59.3144717Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T08:53:59.3144906Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T08:53:59.3145092Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T08:53:59.3145277Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T08:53:59.3145460Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T08:53:59.3145642Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T08:53:59.3145824Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T08:53:59.3146007Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T08:53:59.3146191Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T08:53:59.3146410Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T08:53:59.3146596Z * [new branch] gh/andyanwang/39/base -> 
origin/gh/andyanwang/39/base 2025-12-04T08:53:59.3146778Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T08:53:59.3146965Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T08:53:59.3147151Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T08:53:59.3147333Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T08:53:59.3147521Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T08:53:59.3147707Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T08:53:59.3147892Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T08:53:59.3148077Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T08:53:59.3148262Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T08:53:59.3148444Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T08:53:59.3148658Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T08:53:59.3148841Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T08:53:59.3149019Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T08:53:59.3149200Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T08:53:59.3149380Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T08:53:59.3149563Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T08:53:59.3149743Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T08:53:59.3149924Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T08:53:59.3150104Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T08:53:59.3150289Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T08:53:59.3150470Z * 
[new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T08:53:59.3150649Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 2025-12-04T08:53:59.3150830Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-12-04T08:53:59.3151008Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T08:53:59.3151193Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T08:53:59.3151375Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T08:53:59.3151554Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T08:53:59.3151736Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T08:53:59.3151962Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T08:53:59.3152141Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T08:53:59.3152329Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T08:53:59.3152512Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T08:53:59.3152691Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T08:53:59.3152903Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T08:53:59.3153084Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T08:53:59.3153264Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T08:53:59.3153447Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T08:53:59.3153629Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T08:53:59.3153809Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T08:53:59.3153993Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T08:53:59.3154176Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 
2025-12-04T08:53:59.3154355Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T08:53:59.3154541Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T08:53:59.3154720Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T08:53:59.3154901Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T08:53:59.3155122Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T08:53:59.3155302Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T08:53:59.3155483Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T08:53:59.3155663Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T08:53:59.3155841Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T08:53:59.3156022Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T08:53:59.3156205Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T08:53:59.3156386Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T08:53:59.3156565Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T08:53:59.3156752Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T08:53:59.3156931Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T08:53:59.3157113Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T08:53:59.3157295Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T08:53:59.3157473Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 2025-12-04T08:53:59.3157654Z * [new branch] gh/angelayi/142/head -> origin/gh/angelayi/142/head 2025-12-04T08:53:59.3157837Z * [new branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T08:53:59.3158017Z * [new branch] gh/angelayi/143/base -> 
origin/gh/angelayi/143/base 2025-12-04T08:53:59.3158197Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T08:53:59.3158380Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T08:53:59.3158615Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T08:53:59.3158797Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T08:53:59.3158977Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T08:53:59.3159168Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T08:53:59.3159358Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T08:53:59.3159577Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T08:53:59.3159764Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T08:53:59.3159950Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T08:53:59.3160135Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T08:53:59.3160325Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T08:53:59.3160511Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T08:53:59.3160695Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T08:53:59.3160884Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T08:53:59.3161074Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T08:53:59.3161261Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T08:53:59.3161446Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T08:53:59.3161635Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T08:53:59.3161889Z * [new branch] 
gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T08:53:59.3162075Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T08:53:59.3162259Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T08:53:59.3162443Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T08:53:59.3162628Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T08:53:59.3162819Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T08:53:59.3163001Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T08:53:59.3163187Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T08:53:59.3163378Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T08:53:59.3163561Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T08:53:59.3163745Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T08:53:59.3163932Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T08:53:59.3164117Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T08:53:59.3164304Z * [new branch] gh/anijain2305/919/base -> origin/gh/anijain2305/919/base 2025-12-04T08:53:59.3164491Z * [new branch] gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T08:53:59.3164673Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T08:53:59.3164857Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T08:53:59.3165043Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T08:53:59.3165233Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T08:53:59.3165421Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 
2025-12-04T08:53:59.3165603Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T08:53:59.3165789Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T08:53:59.3166027Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T08:53:59.3166210Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T08:53:59.3166395Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T08:53:59.3166579Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T08:53:59.3166767Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T08:53:59.3166953Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T08:53:59.3167138Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T08:53:59.3167321Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T08:53:59.3167505Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T08:53:59.3167694Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T08:53:59.3167877Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T08:53:59.3168063Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T08:53:59.3168279Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T08:53:59.3168462Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T08:53:59.3168647Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T08:53:59.3168831Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T08:53:59.3169014Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T08:53:59.3169204Z * [new branch] 
gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T08:53:59.3169393Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T08:53:59.3169583Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T08:53:59.3169768Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T08:53:59.3169956Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T08:53:59.3170141Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T08:53:59.3170334Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T08:53:59.3170517Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T08:53:59.3170701Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 2025-12-04T08:53:59.3170890Z * [new branch] gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T08:53:59.3171073Z * [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T08:53:59.3171258Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T08:53:59.3171442Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T08:53:59.3171631Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T08:53:59.3171818Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T08:53:59.3172033Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T08:53:59.3172216Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T08:53:59.3172404Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T08:53:59.3172638Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T08:53:59.3172821Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 
2025-12-04T08:53:59.3173008Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T08:53:59.3173198Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T08:53:59.3173382Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T08:53:59.3173566Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T08:53:59.3173749Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T08:53:59.3173938Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T08:53:59.3174123Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T08:53:59.3174312Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T08:53:59.3174499Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T08:53:59.3174684Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T08:53:59.3174913Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T08:53:59.3175100Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T08:53:59.3175287Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T08:53:59.3175470Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T08:53:59.3175655Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T08:53:59.3175840Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T08:53:59.3176026Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T08:53:59.3176211Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T08:53:59.3176398Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T08:53:59.3176589Z * [new branch] 
gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T08:53:59.3176773Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T08:53:59.3176958Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T08:53:59.3177142Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T08:53:59.3177327Z * [new branch] gh/anijain2305/960/base -> origin/gh/anijain2305/960/base 2025-12-04T08:53:59.3177513Z * [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 2025-12-04T08:53:59.3177698Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T08:53:59.3177885Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T08:53:59.3178074Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T08:53:59.3178257Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T08:53:59.3178444Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T08:53:59.3178634Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T08:53:59.3178821Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T08:53:59.3179006Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T08:53:59.3179223Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T08:53:59.3179412Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T08:53:59.3179598Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T08:53:59.3179784Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T08:53:59.3179974Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T08:53:59.3180161Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 
2025-12-04T08:53:59.3180344Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T08:53:59.3180743Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T08:53:59.3180929Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T08:53:59.3181120Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T08:53:59.3181308Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T08:53:59.3181495Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T08:53:59.3181714Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T08:53:59.3181938Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T08:53:59.3182124Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T08:53:59.3182307Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T08:53:59.3182493Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T08:53:59.3182685Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T08:53:59.3182869Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T08:53:59.3183057Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T08:53:59.3183245Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T08:53:59.3183430Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T08:53:59.3183615Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T08:53:59.3183804Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T08:53:59.3183989Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T08:53:59.3184172Z * [new branch] gh/anjali411/216/orig -> 
origin/gh/anjali411/216/orig 2025-12-04T08:53:59.3184359Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-12-04T08:53:59.3184541Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T08:53:59.3184717Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T08:53:59.3184895Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T08:53:59.3185071Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T08:53:59.3185251Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T08:53:59.3185424Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T08:53:59.3185599Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T08:53:59.3185776Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T08:53:59.3185997Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T08:53:59.3186177Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T08:53:59.3186356Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T08:53:59.3186536Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T08:53:59.3186717Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T08:53:59.3186892Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T08:53:59.3187067Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T08:53:59.3187245Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T08:53:59.3187419Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T08:53:59.3187602Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T08:53:59.3187778Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T08:53:59.3187959Z * [new branch] gh/anshul-si/68/base -> 
origin/gh/anshul-si/68/base 2025-12-04T08:53:59.3188183Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T08:53:59.3188361Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T08:53:59.3188538Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T08:53:59.3188710Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T08:53:59.3188886Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T08:53:59.3189062Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T08:53:59.3189238Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T08:53:59.3189417Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T08:53:59.3189592Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T08:53:59.3189771Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T08:53:59.3189946Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T08:53:59.3190124Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T08:53:59.3190299Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T08:53:59.3190476Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T08:53:59.3190652Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T08:53:59.3190829Z * [new branch] gh/anshul-si/73/head -> origin/gh/anshul-si/73/head 2025-12-04T08:53:59.3191009Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 2025-12-04T08:53:59.3191190Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T08:53:59.3191372Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T08:53:59.3191551Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T08:53:59.3191731Z * [new branch] 
gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T08:53:59.3191944Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T08:53:59.3192124Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T08:53:59.3192304Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T08:53:59.3192544Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T08:53:59.3192723Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T08:53:59.3192904Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T08:53:59.3193083Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T08:53:59.3193262Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T08:53:59.3193438Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T08:53:59.3193621Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T08:53:59.3193802Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T08:53:59.3193982Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T08:53:59.3194160Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T08:53:59.3194339Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T08:53:59.3194519Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T08:53:59.3194748Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T08:53:59.3194927Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T08:53:59.3195104Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T08:53:59.3195284Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T08:53:59.3195464Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 
2025-12-04T08:53:59.3195646Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T08:53:59.3195827Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T08:53:59.3196005Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T08:53:59.3196187Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T08:53:59.3196376Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T08:53:59.3196555Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T08:53:59.3196735Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T08:53:59.3196916Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T08:53:59.3197097Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T08:53:59.3197278Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T08:53:59.3197456Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T08:53:59.3197634Z * [new branch] gh/aorenste/153/head -> origin/gh/aorenste/153/head 2025-12-04T08:53:59.3197816Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 2025-12-04T08:53:59.3198000Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T08:53:59.3198180Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T08:53:59.3198359Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T08:53:59.3198542Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T08:53:59.3198721Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T08:53:59.3198939Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T08:53:59.3199118Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T08:53:59.3199296Z * [new branch] gh/aorenste/156/head -> 
origin/gh/aorenste/156/head 2025-12-04T08:53:59.3199479Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T08:53:59.3199661Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T08:53:59.3199842Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T08:53:59.3200020Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T08:53:59.3200205Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T08:53:59.3200384Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T08:53:59.3200567Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T08:53:59.3200744Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T08:53:59.3200925Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T08:53:59.3201140Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T08:53:59.3201328Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T08:53:59.3201524Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T08:53:59.3201717Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T08:53:59.3201943Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T08:53:59.3202133Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T08:53:59.3202321Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T08:53:59.3202497Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T08:53:59.3202680Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T08:53:59.3202861Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T08:53:59.3203037Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T08:53:59.3203214Z * 
[new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T08:53:59.3203393Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T08:53:59.3203566Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T08:53:59.3203740Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T08:53:59.3203916Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T08:53:59.3204091Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T08:53:59.3204269Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T08:53:59.3204445Z * [new branch] gh/bdhirsh/672/base -> origin/gh/bdhirsh/672/base 2025-12-04T08:53:59.3204619Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-12-04T08:53:59.3204794Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T08:53:59.3204968Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T08:53:59.3205140Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T08:53:59.3205319Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T08:53:59.3205535Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T08:53:59.3205712Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T08:53:59.3205889Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T08:53:59.3206066Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T08:53:59.3206241Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T08:53:59.3206312Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T08:53:59.3206384Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T08:53:59.3206452Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T08:53:59.3206521Z * [new branch] 
gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T08:53:59.3206594Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T08:53:59.3206662Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T08:53:59.3206730Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T08:53:59.3206847Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T08:53:59.3206915Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T08:53:59.3206983Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T08:53:59.3207057Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T08:53:59.3207127Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T08:53:59.3207197Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T08:53:59.3207290Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T08:53:59.3207378Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T08:53:59.3207464Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T08:53:59.3207550Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T08:53:59.3207635Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T08:53:59.3207722Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T08:53:59.3207806Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T08:53:59.3207892Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T08:53:59.3207982Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T08:53:59.3208067Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 
2025-12-04T08:53:59.3208151Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T08:53:59.3208238Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T08:53:59.3208322Z * [new branch] gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T08:53:59.3208406Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-12-04T08:53:59.3208492Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T08:53:59.3208576Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T08:53:59.3208696Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T08:53:59.3208780Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T08:53:59.3208864Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T08:53:59.3208953Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T08:53:59.3209034Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T08:53:59.3209114Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T08:53:59.3209192Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T08:53:59.3209266Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T08:53:59.3209340Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T08:53:59.3209416Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T08:53:59.3209490Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T08:53:59.3209564Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T08:53:59.3209676Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 
2025-12-04T08:53:59.3209750Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T08:53:59.3209824Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T08:53:59.3209900Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T08:53:59.3209972Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T08:53:59.3210044Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T08:53:59.3210119Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T08:53:59.3210191Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T08:53:59.3210265Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T08:53:59.3210338Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T08:53:59.3210410Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T08:53:59.3210487Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T08:53:59.3210561Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T08:53:59.3210635Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T08:53:59.3210710Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T08:53:59.3210782Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T08:53:59.3210855Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T08:53:59.3210928Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T08:53:59.3211002Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T08:53:59.3211074Z * [new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T08:53:59.3211147Z * [new branch] gh/bobrenjc93/681/base -> origin/gh/bobrenjc93/681/base 
2025-12-04T08:53:59.3211219Z * [new branch] gh/bobrenjc93/681/head -> origin/gh/bobrenjc93/681/head 2025-12-04T08:53:59.3211293Z * [new branch] gh/bobrenjc93/681/orig -> origin/gh/bobrenjc93/681/orig 2025-12-04T08:53:59.3211407Z * [new branch] gh/bobrenjc93/682/base -> origin/gh/bobrenjc93/682/base 2025-12-04T08:53:59.3211479Z * [new branch] gh/bobrenjc93/682/head -> origin/gh/bobrenjc93/682/head 2025-12-04T08:53:59.3211551Z * [new branch] gh/bobrenjc93/682/orig -> origin/gh/bobrenjc93/682/orig 2025-12-04T08:53:59.3211626Z * [new branch] gh/bobrenjc93/683/base -> origin/gh/bobrenjc93/683/base 2025-12-04T08:53:59.3211698Z * [new branch] gh/bobrenjc93/683/head -> origin/gh/bobrenjc93/683/head 2025-12-04T08:53:59.3211769Z * [new branch] gh/bobrenjc93/683/orig -> origin/gh/bobrenjc93/683/orig 2025-12-04T08:53:59.3211875Z * [new branch] gh/bobrenjc93/684/base -> origin/gh/bobrenjc93/684/base 2025-12-04T08:53:59.3211948Z * [new branch] gh/bobrenjc93/684/head -> origin/gh/bobrenjc93/684/head 2025-12-04T08:53:59.3212021Z * [new branch] gh/bobrenjc93/684/orig -> origin/gh/bobrenjc93/684/orig 2025-12-04T08:53:59.3212095Z * [new branch] gh/bobrenjc93/685/base -> origin/gh/bobrenjc93/685/base 2025-12-04T08:53:59.3212169Z * [new branch] gh/bobrenjc93/685/head -> origin/gh/bobrenjc93/685/head 2025-12-04T08:53:59.3212246Z * [new branch] gh/bobrenjc93/685/orig -> origin/gh/bobrenjc93/685/orig 2025-12-04T08:53:59.3212357Z * [new branch] gh/bobrenjc93/686/base -> origin/gh/bobrenjc93/686/base 2025-12-04T08:53:59.3212431Z * [new branch] gh/bobrenjc93/686/head -> origin/gh/bobrenjc93/686/head 2025-12-04T08:53:59.3212506Z * [new branch] gh/bobrenjc93/686/orig -> origin/gh/bobrenjc93/686/orig 2025-12-04T08:53:59.3212578Z * [new branch] gh/bobrenjc93/687/base -> origin/gh/bobrenjc93/687/base 2025-12-04T08:53:59.3212650Z * [new branch] gh/bobrenjc93/687/head -> origin/gh/bobrenjc93/687/head 2025-12-04T08:53:59.3212724Z * [new branch] gh/bobrenjc93/687/orig -> origin/gh/bobrenjc93/687/orig 
2025-12-04T08:53:59.3212798Z * [new branch] gh/bobrenjc93/688/base -> origin/gh/bobrenjc93/688/base 2025-12-04T08:53:59.3212871Z * [new branch] gh/bobrenjc93/688/head -> origin/gh/bobrenjc93/688/head 2025-12-04T08:53:59.3212946Z * [new branch] gh/bobrenjc93/688/orig -> origin/gh/bobrenjc93/688/orig 2025-12-04T08:53:59.3213021Z * [new branch] gh/bobrenjc93/689/base -> origin/gh/bobrenjc93/689/base 2025-12-04T08:53:59.3213095Z * [new branch] gh/bobrenjc93/689/head -> origin/gh/bobrenjc93/689/head 2025-12-04T08:53:59.3213175Z * [new branch] gh/bobrenjc93/689/orig -> origin/gh/bobrenjc93/689/orig 2025-12-04T08:53:59.3213248Z * [new branch] gh/bobrenjc93/690/base -> origin/gh/bobrenjc93/690/base 2025-12-04T08:53:59.3213321Z * [new branch] gh/bobrenjc93/690/head -> origin/gh/bobrenjc93/690/head 2025-12-04T08:53:59.3213400Z * [new branch] gh/bobrenjc93/690/orig -> origin/gh/bobrenjc93/690/orig 2025-12-04T08:53:59.3213474Z * [new branch] gh/bobrenjc93/691/base -> origin/gh/bobrenjc93/691/base 2025-12-04T08:53:59.3213550Z * [new branch] gh/bobrenjc93/691/head -> origin/gh/bobrenjc93/691/head 2025-12-04T08:53:59.3213624Z * [new branch] gh/bobrenjc93/691/orig -> origin/gh/bobrenjc93/691/orig 2025-12-04T08:53:59.3213699Z * [new branch] gh/bobrenjc93/692/base -> origin/gh/bobrenjc93/692/base 2025-12-04T08:53:59.3213776Z * [new branch] gh/bobrenjc93/692/head -> origin/gh/bobrenjc93/692/head 2025-12-04T08:53:59.3213849Z * [new branch] gh/bobrenjc93/692/orig -> origin/gh/bobrenjc93/692/orig 2025-12-04T08:53:59.3213921Z * [new branch] gh/bobrenjc93/693/base -> origin/gh/bobrenjc93/693/base 2025-12-04T08:53:59.3213996Z * [new branch] gh/bobrenjc93/693/head -> origin/gh/bobrenjc93/693/head 2025-12-04T08:53:59.3214250Z * [new branch] gh/bobrenjc93/693/orig -> origin/gh/bobrenjc93/693/orig 2025-12-04T08:53:59.3214325Z * [new branch] gh/bobrenjc93/694/base -> origin/gh/bobrenjc93/694/base 2025-12-04T08:53:59.3214402Z * [new branch] gh/bobrenjc93/694/head -> origin/gh/bobrenjc93/694/head 
2025-12-04T08:53:59.3214477Z * [new branch] gh/bobrenjc93/694/orig -> origin/gh/bobrenjc93/694/orig 2025-12-04T08:53:59.3214550Z * [new branch] gh/bobrenjc93/695/base -> origin/gh/bobrenjc93/695/base 2025-12-04T08:53:59.3214629Z * [new branch] gh/bobrenjc93/695/head -> origin/gh/bobrenjc93/695/head 2025-12-04T08:53:59.3214702Z * [new branch] gh/bobrenjc93/695/orig -> origin/gh/bobrenjc93/695/orig 2025-12-04T08:53:59.3214770Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-12-04T08:53:59.3214839Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-12-04T08:53:59.3214904Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-12-04T08:53:59.3214968Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-12-04T08:53:59.3215033Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-12-04T08:53:59.3215130Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-12-04T08:53:59.3215196Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-12-04T08:53:59.3215267Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-12-04T08:53:59.3215330Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-12-04T08:53:59.3215396Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-12-04T08:53:59.3215460Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-12-04T08:53:59.3215525Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-12-04T08:53:59.3215592Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-12-04T08:53:59.3215655Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-12-04T08:53:59.3215718Z * [new branch] gh/c00w/58/base -> origin/gh/c00w/58/base 2025-12-04T08:53:59.3215786Z * [new branch] gh/c00w/58/head -> origin/gh/c00w/58/head 2025-12-04T08:53:59.3215849Z * [new branch] gh/c00w/58/orig -> origin/gh/c00w/58/orig 2025-12-04T08:53:59.3215923Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-12-04T08:53:59.3215998Z * [new branch] 
gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-12-04T08:53:59.3216067Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-12-04T08:53:59.3216148Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-12-04T08:53:59.3216229Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-12-04T08:53:59.3216308Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-12-04T08:53:59.3216387Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-12-04T08:53:59.3216470Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-12-04T08:53:59.3216548Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-12-04T08:53:59.3216626Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-12-04T08:53:59.3216707Z * [new branch] gh/coconutruben/57/orig -> origin/gh/coconutruben/57/orig 2025-12-04T08:53:59.3216785Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base 2025-12-04T08:53:59.3216896Z * [new branch] gh/coconutruben/70/head -> origin/gh/coconutruben/70/head 2025-12-04T08:53:59.3216974Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig 2025-12-04T08:53:59.3217050Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base 2025-12-04T08:53:59.3217130Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head 2025-12-04T08:53:59.3217206Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig 2025-12-04T08:53:59.3217282Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base 2025-12-04T08:53:59.3217364Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head 2025-12-04T08:53:59.3217441Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig 2025-12-04T08:53:59.3217517Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base 
2025-12-04T08:53:59.3217597Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head 2025-12-04T08:53:59.3217675Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig 2025-12-04T08:53:59.3217752Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base 2025-12-04T08:53:59.3217861Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head 2025-12-04T08:53:59.3217938Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig 2025-12-04T08:53:59.3218017Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base 2025-12-04T08:53:59.3218099Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head 2025-12-04T08:53:59.3218176Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig 2025-12-04T08:53:59.3218253Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base 2025-12-04T08:53:59.3218335Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head 2025-12-04T08:53:59.3218410Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig 2025-12-04T08:53:59.3218486Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base 2025-12-04T08:53:59.3218565Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head 2025-12-04T08:53:59.3218641Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig 2025-12-04T08:53:59.3218720Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base 2025-12-04T08:53:59.3218797Z * [new branch] gh/coconutruben/83/head -> origin/gh/coconutruben/83/head 2025-12-04T08:53:59.3218874Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig 2025-12-04T08:53:59.3218954Z * [new branch] gh/coconutruben/84/base -> origin/gh/coconutruben/84/base 2025-12-04T08:53:59.3219030Z * [new branch] gh/coconutruben/84/head -> origin/gh/coconutruben/84/head 2025-12-04T08:53:59.3219108Z * [new branch] 
gh/coconutruben/84/orig -> origin/gh/coconutruben/84/orig 2025-12-04T08:53:59.3219188Z * [new branch] gh/coconutruben/85/base -> origin/gh/coconutruben/85/base 2025-12-04T08:53:59.3219264Z * [new branch] gh/coconutruben/85/head -> origin/gh/coconutruben/85/head 2025-12-04T08:53:59.3219339Z * [new branch] gh/coconutruben/85/orig -> origin/gh/coconutruben/85/orig 2025-12-04T08:53:59.3219418Z * [new branch] gh/coconutruben/86/base -> origin/gh/coconutruben/86/base 2025-12-04T08:53:59.3219496Z * [new branch] gh/coconutruben/86/head -> origin/gh/coconutruben/86/head 2025-12-04T08:53:59.3219573Z * [new branch] gh/coconutruben/86/orig -> origin/gh/coconutruben/86/orig 2025-12-04T08:53:59.3219678Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-12-04T08:53:59.3219753Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-12-04T08:53:59.3219829Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-12-04T08:53:59.3219907Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-12-04T08:53:59.3219979Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-12-04T08:53:59.3220053Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-12-04T08:53:59.3220130Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-12-04T08:53:59.3220204Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-12-04T08:53:59.3220273Z * [new branch] gh/d4l3k/1/base -> origin/gh/d4l3k/1/base 2025-12-04T08:53:59.3220341Z * [new branch] gh/d4l3k/1/head -> origin/gh/d4l3k/1/head 2025-12-04T08:53:59.3220406Z * [new branch] gh/d4l3k/2/base -> origin/gh/d4l3k/2/base 2025-12-04T08:53:59.3220476Z * [new branch] gh/d4l3k/2/head -> origin/gh/d4l3k/2/head 2025-12-04T08:53:59.3220591Z * [new branch] gh/d4l3k/2/orig -> origin/gh/d4l3k/2/orig 2025-12-04T08:53:59.3220655Z * [new branch] gh/d4l3k/3/base -> origin/gh/d4l3k/3/base 2025-12-04T08:53:59.3220722Z 
* [new branch] gh/d4l3k/3/head -> origin/gh/d4l3k/3/head 2025-12-04T08:53:59.3220786Z * [new branch] gh/d4l3k/3/orig -> origin/gh/d4l3k/3/orig 2025-12-04T08:53:59.3220850Z * [new branch] gh/d4l3k/4/base -> origin/gh/d4l3k/4/base 2025-12-04T08:53:59.3220921Z * [new branch] gh/d4l3k/4/head -> origin/gh/d4l3k/4/head 2025-12-04T08:53:59.3220987Z * [new branch] gh/d4l3k/4/orig -> origin/gh/d4l3k/4/orig 2025-12-04T08:53:59.3221051Z * [new branch] gh/d4l3k/5/base -> origin/gh/d4l3k/5/base 2025-12-04T08:53:59.3221120Z * [new branch] gh/d4l3k/5/orig -> origin/gh/d4l3k/5/orig 2025-12-04T08:53:59.3221212Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-12-04T08:53:59.3221298Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-12-04T08:53:59.3221387Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-12-04T08:53:59.3221472Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-12-04T08:53:59.3221556Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-12-04T08:53:59.3221643Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-12-04T08:53:59.3221721Z * [new branch] gh/desertfire/605/base -> origin/gh/desertfire/605/base 2025-12-04T08:53:59.3221798Z * [new branch] gh/desertfire/605/head -> origin/gh/desertfire/605/head 2025-12-04T08:53:59.3221901Z * [new branch] gh/desertfire/605/orig -> origin/gh/desertfire/605/orig 2025-12-04T08:53:59.3221978Z * [new branch] gh/desertfire/606/base -> origin/gh/desertfire/606/base 2025-12-04T08:53:59.3222052Z * [new branch] gh/desertfire/606/head -> origin/gh/desertfire/606/head 2025-12-04T08:53:59.3222128Z * [new branch] gh/desertfire/606/orig -> origin/gh/desertfire/606/orig 2025-12-04T08:53:59.3222202Z * [new branch] gh/desertfire/607/base -> origin/gh/desertfire/607/base 2025-12-04T08:53:59.3222281Z * [new branch] gh/desertfire/607/head -> 
origin/gh/desertfire/607/head 2025-12-04T08:53:59.3222397Z * [new branch] gh/desertfire/607/orig -> origin/gh/desertfire/607/orig 2025-12-04T08:53:59.3222471Z * [new branch] gh/desertfire/608/base -> origin/gh/desertfire/608/base 2025-12-04T08:53:59.3222550Z * [new branch] gh/desertfire/608/head -> origin/gh/desertfire/608/head 2025-12-04T08:53:59.3222624Z * [new branch] gh/desertfire/608/orig -> origin/gh/desertfire/608/orig 2025-12-04T08:53:59.3222699Z * [new branch] gh/desertfire/609/base -> origin/gh/desertfire/609/base 2025-12-04T08:53:59.3222777Z * [new branch] gh/desertfire/609/head -> origin/gh/desertfire/609/head 2025-12-04T08:53:59.3222856Z * [new branch] gh/desertfire/609/orig -> origin/gh/desertfire/609/orig 2025-12-04T08:53:59.3222932Z * [new branch] gh/desertfire/610/base -> origin/gh/desertfire/610/base 2025-12-04T08:53:59.3223010Z * [new branch] gh/desertfire/610/head -> origin/gh/desertfire/610/head 2025-12-04T08:53:59.3223089Z * [new branch] gh/desertfire/610/orig -> origin/gh/desertfire/610/orig 2025-12-04T08:53:59.3223164Z * [new branch] gh/desertfire/611/base -> origin/gh/desertfire/611/base 2025-12-04T08:53:59.3223243Z * [new branch] gh/desertfire/611/head -> origin/gh/desertfire/611/head 2025-12-04T08:53:59.3223317Z * [new branch] gh/desertfire/611/orig -> origin/gh/desertfire/611/orig 2025-12-04T08:53:59.3223434Z * [new branch] gh/desertfire/612/base -> origin/gh/desertfire/612/base 2025-12-04T08:53:59.3223512Z * [new branch] gh/desertfire/612/head -> origin/gh/desertfire/612/head 2025-12-04T08:53:59.3223587Z * [new branch] gh/desertfire/612/orig -> origin/gh/desertfire/612/orig 2025-12-04T08:53:59.3223662Z * [new branch] gh/desertfire/613/base -> origin/gh/desertfire/613/base 2025-12-04T08:53:59.3223742Z * [new branch] gh/desertfire/613/head -> origin/gh/desertfire/613/head 2025-12-04T08:53:59.3223818Z * [new branch] gh/desertfire/613/orig -> origin/gh/desertfire/613/orig 2025-12-04T08:53:59.3223896Z * [new branch] gh/desertfire/614/base -> 
origin/gh/desertfire/614/base 2025-12-04T08:53:59.3223970Z * [new branch] gh/desertfire/614/head -> origin/gh/desertfire/614/head 2025-12-04T08:53:59.3224045Z * [new branch] gh/desertfire/614/orig -> origin/gh/desertfire/614/orig 2025-12-04T08:53:59.3224121Z * [new branch] gh/desertfire/615/base -> origin/gh/desertfire/615/base 2025-12-04T08:53:59.3224194Z * [new branch] gh/desertfire/615/head -> origin/gh/desertfire/615/head 2025-12-04T08:53:59.3224268Z * [new branch] gh/desertfire/615/orig -> origin/gh/desertfire/615/orig 2025-12-04T08:53:59.3224344Z * [new branch] gh/desertfire/616/base -> origin/gh/desertfire/616/base 2025-12-04T08:53:59.3224418Z * [new branch] gh/desertfire/616/head -> origin/gh/desertfire/616/head 2025-12-04T08:53:59.3224493Z * [new branch] gh/desertfire/616/orig -> origin/gh/desertfire/616/orig 2025-12-04T08:53:59.3224572Z * [new branch] gh/desertfire/617/base -> origin/gh/desertfire/617/base 2025-12-04T08:53:59.3224644Z * [new branch] gh/desertfire/617/head -> origin/gh/desertfire/617/head 2025-12-04T08:53:59.3224719Z * [new branch] gh/desertfire/617/orig -> origin/gh/desertfire/617/orig 2025-12-04T08:53:59.3224799Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-12-04T08:53:59.3224868Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-12-04T08:53:59.3224940Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-12-04T08:53:59.3225018Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-12-04T08:53:59.3225089Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-12-04T08:53:59.3225182Z * [new branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-12-04T08:53:59.3225254Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-12-04T08:53:59.3225323Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-12-04T08:53:59.3225393Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-12-04T08:53:59.3225463Z * 
[new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-12-04T08:53:59.3225532Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-12-04T08:53:59.3225601Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-12-04T08:53:59.3225670Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-12-04T08:53:59.3225739Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-12-04T08:53:59.3225814Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-12-04T08:53:59.3225884Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-12-04T08:53:59.3225953Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base 2025-12-04T08:53:59.3226053Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head 2025-12-04T08:53:59.3226122Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig 2025-12-04T08:53:59.3226192Z * [new branch] gh/drisspg/218/base -> origin/gh/drisspg/218/base 2025-12-04T08:53:59.3226263Z * [new branch] gh/drisspg/218/head -> origin/gh/drisspg/218/head 2025-12-04T08:53:59.3226331Z * [new branch] gh/drisspg/218/orig -> origin/gh/drisspg/218/orig 2025-12-04T08:53:59.3226400Z * [new branch] gh/drisspg/219/base -> origin/gh/drisspg/219/base 2025-12-04T08:53:59.3226471Z * [new branch] gh/drisspg/219/head -> origin/gh/drisspg/219/head 2025-12-04T08:53:59.3226539Z * [new branch] gh/drisspg/219/orig -> origin/gh/drisspg/219/orig 2025-12-04T08:53:59.3226609Z * [new branch] gh/drisspg/220/base -> origin/gh/drisspg/220/base 2025-12-04T08:53:59.3226685Z * [new branch] gh/drisspg/220/head -> origin/gh/drisspg/220/head 2025-12-04T08:53:59.3226754Z * [new branch] gh/drisspg/220/orig -> origin/gh/drisspg/220/orig 2025-12-04T08:53:59.3226823Z * [new branch] gh/drisspg/221/base -> origin/gh/drisspg/221/base 2025-12-04T08:53:59.3226896Z * [new branch] gh/drisspg/221/head -> origin/gh/drisspg/221/head 2025-12-04T08:53:59.3226964Z * [new branch] 
gh/drisspg/221/orig -> origin/gh/drisspg/221/orig 2025-12-04T08:53:59.3227032Z * [new branch] gh/drisspg/222/base -> origin/gh/drisspg/222/base 2025-12-04T08:53:59.3227108Z * [new branch] gh/drisspg/222/head -> origin/gh/drisspg/222/head 2025-12-04T08:53:59.3227178Z * [new branch] gh/drisspg/222/orig -> origin/gh/drisspg/222/orig 2025-12-04T08:53:59.3227249Z * [new branch] gh/drisspg/223/base -> origin/gh/drisspg/223/base 2025-12-04T08:53:59.3227319Z * [new branch] gh/drisspg/223/head -> origin/gh/drisspg/223/head 2025-12-04T08:53:59.3227388Z * [new branch] gh/drisspg/223/orig -> origin/gh/drisspg/223/orig 2025-12-04T08:53:59.3227459Z * [new branch] gh/drisspg/224/base -> origin/gh/drisspg/224/base 2025-12-04T08:53:59.3227527Z * [new branch] gh/drisspg/224/head -> origin/gh/drisspg/224/head 2025-12-04T08:53:59.3227598Z * [new branch] gh/drisspg/224/orig -> origin/gh/drisspg/224/orig 2025-12-04T08:53:59.3227669Z * [new branch] gh/drisspg/225/base -> origin/gh/drisspg/225/base 2025-12-04T08:53:59.3227763Z * [new branch] gh/drisspg/225/head -> origin/gh/drisspg/225/head 2025-12-04T08:53:59.3227833Z * [new branch] gh/drisspg/225/orig -> origin/gh/drisspg/225/orig 2025-12-04T08:53:59.3227907Z * [new branch] gh/drisspg/226/base -> origin/gh/drisspg/226/base 2025-12-04T08:53:59.3227977Z * [new branch] gh/drisspg/226/head -> origin/gh/drisspg/226/head 2025-12-04T08:53:59.3228045Z * [new branch] gh/drisspg/226/orig -> origin/gh/drisspg/226/orig 2025-12-04T08:53:59.3228116Z * [new branch] gh/drisspg/227/base -> origin/gh/drisspg/227/base 2025-12-04T08:53:59.3228184Z * [new branch] gh/drisspg/227/head -> origin/gh/drisspg/227/head 2025-12-04T08:53:59.3228253Z * [new branch] gh/drisspg/227/orig -> origin/gh/drisspg/227/orig 2025-12-04T08:53:59.3228323Z * [new branch] gh/drisspg/228/base -> origin/gh/drisspg/228/base 2025-12-04T08:53:59.3228393Z * [new branch] gh/drisspg/228/head -> origin/gh/drisspg/228/head 2025-12-04T08:53:59.3228461Z * [new branch] gh/drisspg/228/orig -> 
origin/gh/drisspg/228/orig 2025-12-04T08:53:59.3228535Z * [new branch] gh/drisspg/229/base -> origin/gh/drisspg/229/base 2025-12-04T08:53:59.3228633Z * [new branch] gh/drisspg/229/head -> origin/gh/drisspg/229/head 2025-12-04T08:53:59.3228702Z * [new branch] gh/drisspg/229/orig -> origin/gh/drisspg/229/orig 2025-12-04T08:53:59.3228772Z * [new branch] gh/drisspg/230/base -> origin/gh/drisspg/230/base 2025-12-04T08:53:59.3228840Z * [new branch] gh/drisspg/230/head -> origin/gh/drisspg/230/head 2025-12-04T08:53:59.3228912Z * [new branch] gh/drisspg/230/orig -> origin/gh/drisspg/230/orig 2025-12-04T08:53:59.3228984Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-12-04T08:53:59.3229056Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-12-04T08:53:59.3229134Z * [new branch] gh/dzmitry-huba/1/base -> origin/gh/dzmitry-huba/1/base 2025-12-04T08:53:59.3229209Z * [new branch] gh/dzmitry-huba/1/head -> origin/gh/dzmitry-huba/1/head 2025-12-04T08:53:59.3229288Z * [new branch] gh/dzmitry-huba/12/base -> origin/gh/dzmitry-huba/12/base 2025-12-04T08:53:59.3229368Z * [new branch] gh/dzmitry-huba/12/head -> origin/gh/dzmitry-huba/12/head 2025-12-04T08:53:59.3229442Z * [new branch] gh/dzmitry-huba/12/orig -> origin/gh/dzmitry-huba/12/orig 2025-12-04T08:53:59.3229516Z * [new branch] gh/dzmitry-huba/13/base -> origin/gh/dzmitry-huba/13/base 2025-12-04T08:53:59.3229591Z * [new branch] gh/dzmitry-huba/13/head -> origin/gh/dzmitry-huba/13/head 2025-12-04T08:53:59.3229667Z * [new branch] gh/dzmitry-huba/13/orig -> origin/gh/dzmitry-huba/13/orig 2025-12-04T08:53:59.3229741Z * [new branch] gh/dzmitry-huba/14/base -> origin/gh/dzmitry-huba/14/base 2025-12-04T08:53:59.3229817Z * [new branch] gh/dzmitry-huba/14/head -> origin/gh/dzmitry-huba/14/head 2025-12-04T08:53:59.3229890Z * [new branch] gh/dzmitry-huba/14/orig -> origin/gh/dzmitry-huba/14/orig 2025-12-04T08:53:59.3229965Z * [new branch] gh/dzmitry-huba/15/base -> origin/gh/dzmitry-huba/15/base 
2025-12-04T08:53:59.3230045Z * [new branch] gh/dzmitry-huba/15/head -> origin/gh/dzmitry-huba/15/head 2025-12-04T08:53:59.3230121Z * [new branch] gh/dzmitry-huba/15/orig -> origin/gh/dzmitry-huba/15/orig 2025-12-04T08:53:59.3230196Z * [new branch] gh/dzmitry-huba/16/base -> origin/gh/dzmitry-huba/16/base 2025-12-04T08:53:59.3230276Z * [new branch] gh/dzmitry-huba/16/head -> origin/gh/dzmitry-huba/16/head 2025-12-04T08:53:59.3230389Z * [new branch] gh/dzmitry-huba/16/orig -> origin/gh/dzmitry-huba/16/orig 2025-12-04T08:53:59.3230465Z * [new branch] gh/dzmitry-huba/17/base -> origin/gh/dzmitry-huba/17/base 2025-12-04T08:53:59.3230542Z * [new branch] gh/dzmitry-huba/17/head -> origin/gh/dzmitry-huba/17/head 2025-12-04T08:53:59.3230617Z * [new branch] gh/dzmitry-huba/17/orig -> origin/gh/dzmitry-huba/17/orig 2025-12-04T08:53:59.3230695Z * [new branch] gh/dzmitry-huba/2/base -> origin/gh/dzmitry-huba/2/base 2025-12-04T08:53:59.3230771Z * [new branch] gh/dzmitry-huba/2/head -> origin/gh/dzmitry-huba/2/head 2025-12-04T08:53:59.3230847Z * [new branch] gh/dzmitry-huba/3/base -> origin/gh/dzmitry-huba/3/base 2025-12-04T08:53:59.3230926Z * [new branch] gh/dzmitry-huba/3/head -> origin/gh/dzmitry-huba/3/head 2025-12-04T08:53:59.3231002Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-12-04T08:53:59.3231077Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-12-04T08:53:59.3231153Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-12-04T08:53:59.3231225Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-12-04T08:53:59.3231322Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-12-04T08:53:59.3231396Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-12-04T08:53:59.3231468Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-12-04T08:53:59.3231538Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 
2025-12-04T08:53:59.3231611Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-12-04T08:53:59.3231681Z * [new branch] gh/eellison/862/base -> origin/gh/eellison/862/base 2025-12-04T08:53:59.3231755Z * [new branch] gh/eellison/862/head -> origin/gh/eellison/862/head 2025-12-04T08:53:59.3231828Z * [new branch] gh/eellison/862/orig -> origin/gh/eellison/862/orig 2025-12-04T08:53:59.3231936Z * [new branch] gh/eellison/863/base -> origin/gh/eellison/863/base 2025-12-04T08:53:59.3232009Z * [new branch] gh/eellison/863/head -> origin/gh/eellison/863/head 2025-12-04T08:53:59.3232084Z * [new branch] gh/eellison/863/orig -> origin/gh/eellison/863/orig 2025-12-04T08:53:59.3232156Z * [new branch] gh/eellison/864/base -> origin/gh/eellison/864/base 2025-12-04T08:53:59.3232231Z * [new branch] gh/eellison/864/head -> origin/gh/eellison/864/head 2025-12-04T08:53:59.3232302Z * [new branch] gh/eellison/864/orig -> origin/gh/eellison/864/orig 2025-12-04T08:53:59.3232374Z * [new branch] gh/eellison/865/base -> origin/gh/eellison/865/base 2025-12-04T08:53:59.3232453Z * [new branch] gh/eellison/865/head -> origin/gh/eellison/865/head 2025-12-04T08:53:59.3232526Z * [new branch] gh/eellison/865/orig -> origin/gh/eellison/865/orig 2025-12-04T08:53:59.3232598Z * [new branch] gh/eellison/866/base -> origin/gh/eellison/866/base 2025-12-04T08:53:59.3232673Z * [new branch] gh/eellison/866/head -> origin/gh/eellison/866/head 2025-12-04T08:53:59.3232744Z * [new branch] gh/eellison/866/orig -> origin/gh/eellison/866/orig 2025-12-04T08:53:59.3232817Z * [new branch] gh/eellison/867/base -> origin/gh/eellison/867/base 2025-12-04T08:53:59.3232890Z * [new branch] gh/eellison/867/head -> origin/gh/eellison/867/head 2025-12-04T08:53:59.3232962Z * [new branch] gh/eellison/867/orig -> origin/gh/eellison/867/orig 2025-12-04T08:53:59.3233076Z * [new branch] gh/eellison/868/base -> origin/gh/eellison/868/base 2025-12-04T08:53:59.3233152Z * [new branch] gh/eellison/868/head -> 
origin/gh/eellison/868/head 2025-12-04T08:53:59.3233224Z * [new branch] gh/eellison/868/orig -> origin/gh/eellison/868/orig 2025-12-04T08:53:59.3233297Z * [new branch] gh/eellison/869/base -> origin/gh/eellison/869/base 2025-12-04T08:53:59.3233372Z * [new branch] gh/eellison/869/head -> origin/gh/eellison/869/head 2025-12-04T08:53:59.3233443Z * [new branch] gh/eellison/869/orig -> origin/gh/eellison/869/orig 2025-12-04T08:53:59.3233512Z * [new branch] gh/eellison/870/base -> origin/gh/eellison/870/base 2025-12-04T08:53:59.3233585Z * [new branch] gh/eellison/870/head -> origin/gh/eellison/870/head 2025-12-04T08:53:59.3233656Z * [new branch] gh/eellison/870/orig -> origin/gh/eellison/870/orig 2025-12-04T08:53:59.3233733Z * [new branch] gh/eellison/871/base -> origin/gh/eellison/871/base 2025-12-04T08:53:59.3233804Z * [new branch] gh/eellison/871/head -> origin/gh/eellison/871/head 2025-12-04T08:53:59.3233877Z * [new branch] gh/eellison/871/orig -> origin/gh/eellison/871/orig 2025-12-04T08:53:59.3233953Z * [new branch] gh/eellison/872/base -> origin/gh/eellison/872/base 2025-12-04T08:53:59.3234067Z * [new branch] gh/eellison/872/head -> origin/gh/eellison/872/head 2025-12-04T08:53:59.3234138Z * [new branch] gh/eellison/872/orig -> origin/gh/eellison/872/orig 2025-12-04T08:53:59.3234213Z * [new branch] gh/eellison/873/base -> origin/gh/eellison/873/base 2025-12-04T08:53:59.3234285Z * [new branch] gh/eellison/873/head -> origin/gh/eellison/873/head 2025-12-04T08:53:59.3234356Z * [new branch] gh/eellison/873/orig -> origin/gh/eellison/873/orig 2025-12-04T08:53:59.3234432Z * [new branch] gh/eellison/874/base -> origin/gh/eellison/874/base 2025-12-04T08:53:59.3234505Z * [new branch] gh/eellison/874/head -> origin/gh/eellison/874/head 2025-12-04T08:53:59.3234576Z * [new branch] gh/eellison/874/orig -> origin/gh/eellison/874/orig 2025-12-04T08:53:59.3234652Z * [new branch] gh/eellison/875/base -> origin/gh/eellison/875/base 2025-12-04T08:53:59.3234726Z * [new branch] 
gh/eellison/875/head -> origin/gh/eellison/875/head 2025-12-04T08:53:59.3234797Z * [new branch] gh/eellison/875/orig -> origin/gh/eellison/875/orig 2025-12-04T08:53:59.3234873Z * [new branch] gh/eellison/876/base -> origin/gh/eellison/876/base 2025-12-04T08:53:59.3234944Z * [new branch] gh/eellison/876/head -> origin/gh/eellison/876/head 2025-12-04T08:53:59.3235015Z * [new branch] gh/eellison/876/orig -> origin/gh/eellison/876/orig 2025-12-04T08:53:59.3235092Z * [new branch] gh/eellison/877/base -> origin/gh/eellison/877/base 2025-12-04T08:53:59.3235163Z * [new branch] gh/eellison/877/head -> origin/gh/eellison/877/head 2025-12-04T08:53:59.3235236Z * [new branch] gh/eellison/877/orig -> origin/gh/eellison/877/orig 2025-12-04T08:53:59.3235312Z * [new branch] gh/eellison/878/base -> origin/gh/eellison/878/base 2025-12-04T08:53:59.3235384Z * [new branch] gh/eellison/878/head -> origin/gh/eellison/878/head 2025-12-04T08:53:59.3235458Z * [new branch] gh/eellison/878/orig -> origin/gh/eellison/878/orig 2025-12-04T08:53:59.3235532Z * [new branch] gh/eellison/879/base -> origin/gh/eellison/879/base 2025-12-04T08:53:59.3235605Z * [new branch] gh/eellison/879/head -> origin/gh/eellison/879/head 2025-12-04T08:53:59.3235682Z * [new branch] gh/eellison/879/orig -> origin/gh/eellison/879/orig 2025-12-04T08:53:59.3235780Z * [new branch] gh/eellison/880/base -> origin/gh/eellison/880/base 2025-12-04T08:53:59.3235852Z * [new branch] gh/eellison/880/head -> origin/gh/eellison/880/head 2025-12-04T08:53:59.3235927Z * [new branch] gh/eellison/880/orig -> origin/gh/eellison/880/orig 2025-12-04T08:53:59.3235999Z * [new branch] gh/eellison/881/base -> origin/gh/eellison/881/base 2025-12-04T08:53:59.3236072Z * [new branch] gh/eellison/881/head -> origin/gh/eellison/881/head 2025-12-04T08:53:59.3236147Z * [new branch] gh/eellison/881/orig -> origin/gh/eellison/881/orig 2025-12-04T08:53:59.3236220Z * [new branch] gh/eellison/882/base -> origin/gh/eellison/882/base 
2025-12-04T08:53:59.3236292Z * [new branch] gh/eellison/882/head -> origin/gh/eellison/882/head 2025-12-04T08:53:59.3236366Z * [new branch] gh/eellison/882/orig -> origin/gh/eellison/882/orig 2025-12-04T08:53:59.3236440Z * [new branch] gh/eellison/883/base -> origin/gh/eellison/883/base 2025-12-04T08:53:59.3236512Z * [new branch] gh/eellison/883/head -> origin/gh/eellison/883/head 2025-12-04T08:53:59.3236587Z * [new branch] gh/eellison/883/orig -> origin/gh/eellison/883/orig 2025-12-04T08:53:59.3236690Z * [new branch] gh/eellison/884/base -> origin/gh/eellison/884/base 2025-12-04T08:53:59.3236761Z * [new branch] gh/eellison/884/head -> origin/gh/eellison/884/head 2025-12-04T08:53:59.3236837Z * [new branch] gh/eellison/884/orig -> origin/gh/eellison/884/orig 2025-12-04T08:53:59.3236906Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-12-04T08:53:59.3236976Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-12-04T08:53:59.3237043Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-12-04T08:53:59.3237110Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-12-04T08:53:59.3237180Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-12-04T08:53:59.3237248Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-12-04T08:53:59.3237317Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-12-04T08:53:59.3237385Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-12-04T08:53:59.3237454Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-12-04T08:53:59.3237520Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-12-04T08:53:59.3237588Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-12-04T08:53:59.3237653Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-12-04T08:53:59.3237719Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-12-04T08:53:59.3237787Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 
2025-12-04T08:53:59.3237852Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-12-04T08:53:59.3237916Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-12-04T08:53:59.3237988Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-12-04T08:53:59.3238052Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-12-04T08:53:59.3238117Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-12-04T08:53:59.3238187Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-12-04T08:53:59.3238253Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-12-04T08:53:59.3238344Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-12-04T08:53:59.3238411Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-12-04T08:53:59.3238476Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-12-04T08:53:59.3238546Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-12-04T08:53:59.3238613Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-12-04T08:53:59.3238678Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-12-04T08:53:59.3238746Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-12-04T08:53:59.3238811Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-12-04T08:53:59.3238877Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 2025-12-04T08:53:59.3238945Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-12-04T08:53:59.3239012Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-12-04T08:53:59.3239078Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base 2025-12-04T08:53:59.3239149Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head 2025-12-04T08:53:59.3239243Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig 2025-12-04T08:53:59.3239310Z * [new branch] gh/etaf/173/base -> origin/gh/etaf/173/base 2025-12-04T08:53:59.3239378Z * [new branch] gh/etaf/173/head -> 
origin/gh/etaf/173/head 2025-12-04T08:53:59.3239442Z * [new branch] gh/etaf/173/orig -> origin/gh/etaf/173/orig 2025-12-04T08:53:59.3239508Z * [new branch] gh/etaf/174/base -> origin/gh/etaf/174/base 2025-12-04T08:53:59.3239577Z * [new branch] gh/etaf/174/head -> origin/gh/etaf/174/head 2025-12-04T08:53:59.3239644Z * [new branch] gh/etaf/175/base -> origin/gh/etaf/175/base 2025-12-04T08:53:59.3239709Z * [new branch] gh/etaf/175/head -> origin/gh/etaf/175/head 2025-12-04T08:53:59.3239777Z * [new branch] gh/etaf/175/orig -> origin/gh/etaf/175/orig 2025-12-04T08:53:59.3239843Z * [new branch] gh/etaf/176/base -> origin/gh/etaf/176/base 2025-12-04T08:53:59.3239908Z * [new branch] gh/etaf/176/head -> origin/gh/etaf/176/head 2025-12-04T08:53:59.3239978Z * [new branch] gh/etaf/176/orig -> origin/gh/etaf/176/orig 2025-12-04T08:53:59.3240043Z * [new branch] gh/etaf/177/base -> origin/gh/etaf/177/base 2025-12-04T08:53:59.3240110Z * [new branch] gh/etaf/177/head -> origin/gh/etaf/177/head 2025-12-04T08:53:59.3240174Z * [new branch] gh/etaf/177/orig -> origin/gh/etaf/177/orig 2025-12-04T08:53:59.3240240Z * [new branch] gh/etaf/178/base -> origin/gh/etaf/178/base 2025-12-04T08:53:59.3240308Z * [new branch] gh/etaf/178/head -> origin/gh/etaf/178/head 2025-12-04T08:53:59.3240373Z * [new branch] gh/etaf/178/orig -> origin/gh/etaf/178/orig 2025-12-04T08:53:59.3240438Z * [new branch] gh/etaf/179/base -> origin/gh/etaf/179/base 2025-12-04T08:53:59.3240509Z * [new branch] gh/etaf/179/head -> origin/gh/etaf/179/head 2025-12-04T08:53:59.3240573Z * [new branch] gh/etaf/179/orig -> origin/gh/etaf/179/orig 2025-12-04T08:53:59.3240639Z * [new branch] gh/etaf/180/base -> origin/gh/etaf/180/base 2025-12-04T08:53:59.3240708Z * [new branch] gh/etaf/180/head -> origin/gh/etaf/180/head 2025-12-04T08:53:59.3240772Z * [new branch] gh/etaf/180/orig -> origin/gh/etaf/180/orig 2025-12-04T08:53:59.3240889Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 
2025-12-04T08:53:59.3240970Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-12-04T08:53:59.3241048Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-12-04T08:53:59.3241124Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-12-04T08:53:59.3241207Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-12-04T08:53:59.3241284Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-12-04T08:53:59.3241359Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-12-04T08:53:59.3241438Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-12-04T08:53:59.3241509Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-12-04T08:53:59.3241581Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-12-04T08:53:59.3241655Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 2025-12-04T08:53:59.3241727Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-12-04T08:53:59.3241828Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-12-04T08:53:59.3241962Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-12-04T08:53:59.3242032Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-12-04T08:53:59.3242105Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-12-04T08:53:59.3242176Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-12-04T08:53:59.3242245Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-12-04T08:53:59.3242322Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-12-04T08:53:59.3242390Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-12-04T08:53:59.3242460Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-12-04T08:53:59.3242535Z * [new branch] gh/ezyang/3139/head -> 
origin/gh/ezyang/3139/head 2025-12-04T08:53:59.3242605Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-12-04T08:53:59.3242675Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-12-04T08:53:59.3242749Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-12-04T08:53:59.3242818Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-12-04T08:53:59.3242887Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-12-04T08:53:59.3242960Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-12-04T08:53:59.3243029Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-12-04T08:53:59.3243097Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base 2025-12-04T08:53:59.3243172Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head 2025-12-04T08:53:59.3243241Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig 2025-12-04T08:53:59.3243311Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base 2025-12-04T08:53:59.3243384Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head 2025-12-04T08:53:59.3243453Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig 2025-12-04T08:53:59.3243523Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base 2025-12-04T08:53:59.3243631Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head 2025-12-04T08:53:59.3243699Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig 2025-12-04T08:53:59.3243770Z * [new branch] gh/ezyang/3175/base -> origin/gh/ezyang/3175/base 2025-12-04T08:53:59.3243841Z * [new branch] gh/ezyang/3175/head -> origin/gh/ezyang/3175/head 2025-12-04T08:53:59.3243910Z * [new branch] gh/ezyang/3175/orig -> origin/gh/ezyang/3175/orig 2025-12-04T08:53:59.3243983Z * [new branch] gh/ezyang/3182/base -> origin/gh/ezyang/3182/base 2025-12-04T08:53:59.3244052Z * [new branch] gh/ezyang/3182/head -> 
origin/gh/ezyang/3182/head 2025-12-04T08:53:59.3244122Z * [new branch] gh/ezyang/3182/orig -> origin/gh/ezyang/3182/orig 2025-12-04T08:53:59.3244193Z * [new branch] gh/ezyang/3185/base -> origin/gh/ezyang/3185/base 2025-12-04T08:53:59.3244263Z * [new branch] gh/ezyang/3185/head -> origin/gh/ezyang/3185/head 2025-12-04T08:53:59.3244333Z * [new branch] gh/ezyang/3185/orig -> origin/gh/ezyang/3185/orig 2025-12-04T08:53:59.3244406Z * [new branch] gh/ezyang/3189/base -> origin/gh/ezyang/3189/base 2025-12-04T08:53:59.3244518Z * [new branch] gh/ezyang/3189/head -> origin/gh/ezyang/3189/head 2025-12-04T08:53:59.3244587Z * [new branch] gh/ezyang/3189/orig -> origin/gh/ezyang/3189/orig 2025-12-04T08:53:59.3244660Z * [new branch] gh/ezyang/3191/base -> origin/gh/ezyang/3191/base 2025-12-04T08:53:59.3244729Z * [new branch] gh/ezyang/3191/head -> origin/gh/ezyang/3191/head 2025-12-04T08:53:59.3244798Z * [new branch] gh/ezyang/3191/orig -> origin/gh/ezyang/3191/orig 2025-12-04T08:53:59.3244873Z * [new branch] gh/ezyang/3192/base -> origin/gh/ezyang/3192/base 2025-12-04T08:53:59.3244944Z * [new branch] gh/ezyang/3192/head -> origin/gh/ezyang/3192/head 2025-12-04T08:53:59.3245013Z * [new branch] gh/ezyang/3192/orig -> origin/gh/ezyang/3192/orig 2025-12-04T08:53:59.3245085Z * [new branch] gh/ezyang/3193/base -> origin/gh/ezyang/3193/base 2025-12-04T08:53:59.3245156Z * [new branch] gh/ezyang/3193/head -> origin/gh/ezyang/3193/head 2025-12-04T08:53:59.3245225Z * [new branch] gh/ezyang/3193/orig -> origin/gh/ezyang/3193/orig 2025-12-04T08:53:59.3245300Z * [new branch] gh/ezyang/3194/base -> origin/gh/ezyang/3194/base 2025-12-04T08:53:59.3245370Z * [new branch] gh/ezyang/3194/head -> origin/gh/ezyang/3194/head 2025-12-04T08:53:59.3245444Z * [new branch] gh/ezyang/3194/orig -> origin/gh/ezyang/3194/orig 2025-12-04T08:53:59.3245512Z * [new branch] gh/ezyang/3195/base -> origin/gh/ezyang/3195/base 2025-12-04T08:53:59.3245583Z * [new branch] gh/ezyang/3195/head -> 
origin/gh/ezyang/3195/head 2025-12-04T08:53:59.3245654Z * [new branch] gh/ezyang/3195/orig -> origin/gh/ezyang/3195/orig 2025-12-04T08:53:59.3245723Z * [new branch] gh/ezyang/3196/base -> origin/gh/ezyang/3196/base 2025-12-04T08:53:59.3245794Z * [new branch] gh/ezyang/3196/head -> origin/gh/ezyang/3196/head 2025-12-04T08:53:59.3245866Z * [new branch] gh/ezyang/3196/orig -> origin/gh/ezyang/3196/orig 2025-12-04T08:53:59.3245934Z * [new branch] gh/ezyang/3197/base -> origin/gh/ezyang/3197/base 2025-12-04T08:53:59.3246003Z * [new branch] gh/ezyang/3197/head -> origin/gh/ezyang/3197/head 2025-12-04T08:53:59.3246077Z * [new branch] gh/ezyang/3197/orig -> origin/gh/ezyang/3197/orig 2025-12-04T08:53:59.3246145Z * [new branch] gh/ezyang/3198/base -> origin/gh/ezyang/3198/base 2025-12-04T08:53:59.3246241Z * [new branch] gh/ezyang/3198/head -> origin/gh/ezyang/3198/head 2025-12-04T08:53:59.3246314Z * [new branch] gh/ezyang/3198/orig -> origin/gh/ezyang/3198/orig 2025-12-04T08:53:59.3246383Z * [new branch] gh/ezyang/3199/base -> origin/gh/ezyang/3199/base 2025-12-04T08:53:59.3246455Z * [new branch] gh/ezyang/3199/head -> origin/gh/ezyang/3199/head 2025-12-04T08:53:59.3246529Z * [new branch] gh/ezyang/3199/orig -> origin/gh/ezyang/3199/orig 2025-12-04T08:53:59.3246598Z * [new branch] gh/ezyang/3200/base -> origin/gh/ezyang/3200/base 2025-12-04T08:53:59.3246669Z * [new branch] gh/ezyang/3200/head -> origin/gh/ezyang/3200/head 2025-12-04T08:53:59.3246743Z * [new branch] gh/ezyang/3200/orig -> origin/gh/ezyang/3200/orig 2025-12-04T08:53:59.3246813Z * [new branch] gh/ezyang/3201/base -> origin/gh/ezyang/3201/base 2025-12-04T08:53:59.3246883Z * [new branch] gh/ezyang/3201/head -> origin/gh/ezyang/3201/head 2025-12-04T08:53:59.3246954Z * [new branch] gh/ezyang/3201/orig -> origin/gh/ezyang/3201/orig 2025-12-04T08:53:59.3247022Z * [new branch] gh/ezyang/3202/base -> origin/gh/ezyang/3202/base 2025-12-04T08:53:59.3247119Z * [new branch] gh/ezyang/3202/head -> 
origin/gh/ezyang/3202/head 2025-12-04T08:53:59.3247187Z * [new branch] gh/ezyang/3202/orig -> origin/gh/ezyang/3202/orig 2025-12-04T08:53:59.3247256Z * [new branch] gh/ezyang/3203/base -> origin/gh/ezyang/3203/base 2025-12-04T08:53:59.3247328Z * [new branch] gh/ezyang/3203/head -> origin/gh/ezyang/3203/head 2025-12-04T08:53:59.3247396Z * [new branch] gh/ezyang/3203/orig -> origin/gh/ezyang/3203/orig 2025-12-04T08:53:59.3247465Z * [new branch] gh/ezyang/3204/base -> origin/gh/ezyang/3204/base 2025-12-04T08:53:59.3247542Z * [new branch] gh/ezyang/3204/head -> origin/gh/ezyang/3204/head 2025-12-04T08:53:59.3247611Z * [new branch] gh/ezyang/3204/orig -> origin/gh/ezyang/3204/orig 2025-12-04T08:53:59.3247680Z * [new branch] gh/ezyang/3205/base -> origin/gh/ezyang/3205/base 2025-12-04T08:53:59.3247752Z * [new branch] gh/ezyang/3205/head -> origin/gh/ezyang/3205/head 2025-12-04T08:53:59.3247821Z * [new branch] gh/ezyang/3205/orig -> origin/gh/ezyang/3205/orig 2025-12-04T08:53:59.3247888Z * [new branch] gh/ezyang/3206/base -> origin/gh/ezyang/3206/base 2025-12-04T08:53:59.3247958Z * [new branch] gh/ezyang/3206/head -> origin/gh/ezyang/3206/head 2025-12-04T08:53:59.3248027Z * [new branch] gh/ezyang/3206/orig -> origin/gh/ezyang/3206/orig 2025-12-04T08:53:59.3248097Z * [new branch] gh/ezyang/3207/base -> origin/gh/ezyang/3207/base 2025-12-04T08:53:59.3248172Z * [new branch] gh/ezyang/3207/head -> origin/gh/ezyang/3207/head 2025-12-04T08:53:59.3248242Z * [new branch] gh/ezyang/3207/orig -> origin/gh/ezyang/3207/orig 2025-12-04T08:53:59.3248310Z * [new branch] gh/ezyang/3208/base -> origin/gh/ezyang/3208/base 2025-12-04T08:53:59.3248384Z * [new branch] gh/ezyang/3208/head -> origin/gh/ezyang/3208/head 2025-12-04T08:53:59.3255250Z * [new branch] gh/ezyang/3208/orig -> origin/gh/ezyang/3208/orig 2025-12-04T08:53:59.3255343Z * [new branch] gh/ezyang/3209/base -> origin/gh/ezyang/3209/base 2025-12-04T08:53:59.3255417Z * [new branch] gh/ezyang/3209/head -> 
origin/gh/ezyang/3209/head 2025-12-04T08:53:59.3255485Z * [new branch] gh/ezyang/3209/orig -> origin/gh/ezyang/3209/orig 2025-12-04T08:53:59.3255559Z * [new branch] gh/fadara01/3/base -> origin/gh/fadara01/3/base 2025-12-04T08:53:59.3255707Z * [new branch] gh/fadara01/3/head -> origin/gh/fadara01/3/head 2025-12-04T08:53:59.3255777Z * [new branch] gh/fadara01/3/orig -> origin/gh/fadara01/3/orig 2025-12-04T08:53:59.3255844Z * [new branch] gh/fadara01/5/base -> origin/gh/fadara01/5/base 2025-12-04T08:53:59.3255921Z * [new branch] gh/fadara01/5/head -> origin/gh/fadara01/5/head 2025-12-04T08:53:59.3255990Z * [new branch] gh/fadara01/5/orig -> origin/gh/fadara01/5/orig 2025-12-04T08:53:59.3256058Z * [new branch] gh/fadara01/6/base -> origin/gh/fadara01/6/base 2025-12-04T08:53:59.3256126Z * [new branch] gh/fadara01/6/head -> origin/gh/fadara01/6/head 2025-12-04T08:53:59.3256194Z * [new branch] gh/fadara01/6/orig -> origin/gh/fadara01/6/orig 2025-12-04T08:53:59.3256270Z * [new branch] gh/fadara01/7/base -> origin/gh/fadara01/7/base 2025-12-04T08:53:59.3256342Z * [new branch] gh/fadara01/7/head -> origin/gh/fadara01/7/head 2025-12-04T08:53:59.3256410Z * [new branch] gh/fadara01/7/orig -> origin/gh/fadara01/7/orig 2025-12-04T08:53:59.3256481Z * [new branch] gh/fadara01/8/base -> origin/gh/fadara01/8/base 2025-12-04T08:53:59.3256604Z * [new branch] gh/fadara01/8/head -> origin/gh/fadara01/8/head 2025-12-04T08:53:59.3256676Z * [new branch] gh/fadara01/8/orig -> origin/gh/fadara01/8/orig 2025-12-04T08:53:59.3256747Z * [new branch] gh/fadara01/9/base -> origin/gh/fadara01/9/base 2025-12-04T08:53:59.3256818Z * [new branch] gh/fadara01/9/head -> origin/gh/fadara01/9/head 2025-12-04T08:53:59.3256886Z * [new branch] gh/fadara01/9/orig -> origin/gh/fadara01/9/orig 2025-12-04T08:53:59.3256965Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-12-04T08:53:59.3257041Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-12-04T08:53:59.3257110Z * [new 
branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-12-04T08:53:59.3257181Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base 2025-12-04T08:53:59.3257256Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head 2025-12-04T08:53:59.3257324Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig 2025-12-04T08:53:59.3257393Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base 2025-12-04T08:53:59.3257462Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head 2025-12-04T08:53:59.3257532Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig 2025-12-04T08:53:59.3257604Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base 2025-12-04T08:53:59.3257677Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head 2025-12-04T08:53:59.3257749Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig 2025-12-04T08:53:59.3257822Z * [new branch] gh/fduwjj/226/base -> origin/gh/fduwjj/226/base 2025-12-04T08:53:59.3257895Z * [new branch] gh/fduwjj/226/head -> origin/gh/fduwjj/226/head 2025-12-04T08:53:59.3257965Z * [new branch] gh/fduwjj/226/orig -> origin/gh/fduwjj/226/orig 2025-12-04T08:53:59.3258036Z * [new branch] gh/fduwjj/229/base -> origin/gh/fduwjj/229/base 2025-12-04T08:53:59.3258104Z * [new branch] gh/fduwjj/229/head -> origin/gh/fduwjj/229/head 2025-12-04T08:53:59.3258174Z * [new branch] gh/fduwjj/229/orig -> origin/gh/fduwjj/229/orig 2025-12-04T08:53:59.3258245Z * [new branch] gh/fduwjj/233/base -> origin/gh/fduwjj/233/base 2025-12-04T08:53:59.3258338Z * [new branch] gh/fduwjj/233/head -> origin/gh/fduwjj/233/head 2025-12-04T08:53:59.3258412Z * [new branch] gh/fduwjj/233/orig -> origin/gh/fduwjj/233/orig 2025-12-04T08:53:59.3258480Z * [new branch] gh/fduwjj/234/base -> origin/gh/fduwjj/234/base 2025-12-04T08:53:59.3258550Z * [new branch] gh/fduwjj/234/head -> origin/gh/fduwjj/234/head 2025-12-04T08:53:59.3258622Z * [new branch] gh/fduwjj/234/orig -> origin/gh/fduwjj/234/orig 
2025-12-04T08:53:59.3258690Z * [new branch] gh/fduwjj/235/base -> origin/gh/fduwjj/235/base 2025-12-04T08:53:59.3258758Z * [new branch] gh/fduwjj/235/head -> origin/gh/fduwjj/235/head 2025-12-04T08:53:59.3258832Z * [new branch] gh/fduwjj/235/orig -> origin/gh/fduwjj/235/orig 2025-12-04T08:53:59.3258900Z * [new branch] gh/fduwjj/236/base -> origin/gh/fduwjj/236/base 2025-12-04T08:53:59.3258970Z * [new branch] gh/fduwjj/236/head -> origin/gh/fduwjj/236/head 2025-12-04T08:53:59.3259042Z * [new branch] gh/fduwjj/236/orig -> origin/gh/fduwjj/236/orig 2025-12-04T08:53:59.3259111Z * [new branch] gh/fduwjj/237/base -> origin/gh/fduwjj/237/base 2025-12-04T08:53:59.3259181Z * [new branch] gh/fduwjj/237/head -> origin/gh/fduwjj/237/head 2025-12-04T08:53:59.3259286Z * [new branch] gh/fduwjj/237/orig -> origin/gh/fduwjj/237/orig 2025-12-04T08:53:59.3259354Z * [new branch] gh/fduwjj/238/base -> origin/gh/fduwjj/238/base 2025-12-04T08:53:59.3259423Z * [new branch] gh/fduwjj/238/head -> origin/gh/fduwjj/238/head 2025-12-04T08:53:59.3259497Z * [new branch] gh/fduwjj/238/orig -> origin/gh/fduwjj/238/orig 2025-12-04T08:53:59.3259565Z * [new branch] gh/fduwjj/239/base -> origin/gh/fduwjj/239/base 2025-12-04T08:53:59.3259644Z * [new branch] gh/fduwjj/239/head -> origin/gh/fduwjj/239/head 2025-12-04T08:53:59.3259711Z * [new branch] gh/fduwjj/239/orig -> origin/gh/fduwjj/239/orig 2025-12-04T08:53:59.3259784Z * [new branch] gh/fegin/332/base -> origin/gh/fegin/332/base 2025-12-04T08:53:59.3259857Z * [new branch] gh/fegin/332/head -> origin/gh/fegin/332/head 2025-12-04T08:53:59.3259926Z * [new branch] gh/fegin/332/orig -> origin/gh/fegin/332/orig 2025-12-04T08:53:59.3259991Z * [new branch] gh/fegin/333/base -> origin/gh/fegin/333/base 2025-12-04T08:53:59.3260061Z * [new branch] gh/fegin/333/head -> origin/gh/fegin/333/head 2025-12-04T08:53:59.3260127Z * [new branch] gh/fegin/333/orig -> origin/gh/fegin/333/orig 2025-12-04T08:53:59.3260193Z * [new branch] gh/fegin/334/base -> 
origin/gh/fegin/334/base 2025-12-04T08:53:59.3260261Z * [new branch] gh/fegin/334/head -> origin/gh/fegin/334/head 2025-12-04T08:53:59.3260326Z * [new branch] gh/fegin/334/orig -> origin/gh/fegin/334/orig 2025-12-04T08:53:59.3260391Z * [new branch] gh/fegin/335/base -> origin/gh/fegin/335/base 2025-12-04T08:53:59.3260462Z * [new branch] gh/fegin/335/head -> origin/gh/fegin/335/head 2025-12-04T08:53:59.3260530Z * [new branch] gh/fegin/335/orig -> origin/gh/fegin/335/orig 2025-12-04T08:53:59.3260599Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base 2025-12-04T08:53:59.3260670Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head 2025-12-04T08:53:59.3260738Z * [new branch] gh/fffrog/177/base -> origin/gh/fffrog/177/base 2025-12-04T08:53:59.3260805Z * [new branch] gh/fffrog/177/head -> origin/gh/fffrog/177/head 2025-12-04T08:53:59.3260905Z * [new branch] gh/fffrog/177/orig -> origin/gh/fffrog/177/orig 2025-12-04T08:53:59.3260973Z * [new branch] gh/fffrog/178/base -> origin/gh/fffrog/178/base 2025-12-04T08:53:59.3261039Z * [new branch] gh/fffrog/178/head -> origin/gh/fffrog/178/head 2025-12-04T08:53:59.3261108Z * [new branch] gh/fffrog/178/orig -> origin/gh/fffrog/178/orig 2025-12-04T08:53:59.3261176Z * [new branch] gh/fffrog/181/base -> origin/gh/fffrog/181/base 2025-12-04T08:53:59.3261244Z * [new branch] gh/fffrog/181/head -> origin/gh/fffrog/181/head 2025-12-04T08:53:59.3261311Z * [new branch] gh/fffrog/181/orig -> origin/gh/fffrog/181/orig 2025-12-04T08:53:59.3261378Z * [new branch] gh/fffrog/183/base -> origin/gh/fffrog/183/base 2025-12-04T08:53:59.3261446Z * [new branch] gh/fffrog/183/head -> origin/gh/fffrog/183/head 2025-12-04T08:53:59.3261514Z * [new branch] gh/fffrog/183/orig -> origin/gh/fffrog/183/orig 2025-12-04T08:53:59.3261587Z * [new branch] gh/fxdawnn/10/base -> origin/gh/fxdawnn/10/base 2025-12-04T08:53:59.3261657Z * [new branch] gh/fxdawnn/10/head -> origin/gh/fxdawnn/10/head 2025-12-04T08:53:59.3261726Z * [new branch] 
gh/fxdawnn/10/orig -> origin/gh/fxdawnn/10/orig 2025-12-04T08:53:59.3261825Z * [new branch] gh/fxdawnn/11/base -> origin/gh/fxdawnn/11/base 2025-12-04T08:53:59.3261943Z * [new branch] gh/fxdawnn/11/head -> origin/gh/fxdawnn/11/head 2025-12-04T08:53:59.3262011Z * [new branch] gh/fxdawnn/11/orig -> origin/gh/fxdawnn/11/orig 2025-12-04T08:53:59.3262080Z * [new branch] gh/fxdawnn/12/base -> origin/gh/fxdawnn/12/base 2025-12-04T08:53:59.3262149Z * [new branch] gh/fxdawnn/12/head -> origin/gh/fxdawnn/12/head 2025-12-04T08:53:59.3262215Z * [new branch] gh/fxdawnn/12/orig -> origin/gh/fxdawnn/12/orig 2025-12-04T08:53:59.3262288Z * [new branch] gh/fxdawnn/13/base -> origin/gh/fxdawnn/13/base 2025-12-04T08:53:59.3262356Z * [new branch] gh/fxdawnn/13/head -> origin/gh/fxdawnn/13/head 2025-12-04T08:53:59.3262423Z * [new branch] gh/fxdawnn/13/orig -> origin/gh/fxdawnn/13/orig 2025-12-04T08:53:59.3262493Z * [new branch] gh/fxdawnn/14/base -> origin/gh/fxdawnn/14/base 2025-12-04T08:53:59.3262559Z * [new branch] gh/fxdawnn/14/head -> origin/gh/fxdawnn/14/head 2025-12-04T08:53:59.3262626Z * [new branch] gh/fxdawnn/14/orig -> origin/gh/fxdawnn/14/orig 2025-12-04T08:53:59.3262692Z * [new branch] gh/fxdawnn/15/base -> origin/gh/fxdawnn/15/base 2025-12-04T08:53:59.3262759Z * [new branch] gh/fxdawnn/15/head -> origin/gh/fxdawnn/15/head 2025-12-04T08:53:59.3262827Z * [new branch] gh/fxdawnn/15/orig -> origin/gh/fxdawnn/15/orig 2025-12-04T08:53:59.3262896Z * [new branch] gh/fxdawnn/6/base -> origin/gh/fxdawnn/6/base 2025-12-04T08:53:59.3262964Z * [new branch] gh/fxdawnn/6/head -> origin/gh/fxdawnn/6/head 2025-12-04T08:53:59.3263034Z * [new branch] gh/fxdawnn/6/orig -> origin/gh/fxdawnn/6/orig 2025-12-04T08:53:59.3263102Z * [new branch] gh/fxdawnn/7/base -> origin/gh/fxdawnn/7/base 2025-12-04T08:53:59.3263168Z * [new branch] gh/fxdawnn/7/head -> origin/gh/fxdawnn/7/head 2025-12-04T08:53:59.3263234Z * [new branch] gh/fxdawnn/7/orig -> origin/gh/fxdawnn/7/orig 2025-12-04T08:53:59.3263299Z * 
[new branch] gh/fxdawnn/9/base -> origin/gh/fxdawnn/9/base 2025-12-04T08:53:59.3263364Z * [new branch] gh/fxdawnn/9/head -> origin/gh/fxdawnn/9/head 2025-12-04T08:53:59.3263430Z * [new branch] gh/fxdawnn/9/orig -> origin/gh/fxdawnn/9/orig 2025-12-04T08:53:59.3263547Z * [new branch] gh/galv/1/base -> origin/gh/galv/1/base 2025-12-04T08:53:59.3263612Z * [new branch] gh/galv/1/head -> origin/gh/galv/1/head 2025-12-04T08:53:59.3263674Z * [new branch] gh/galv/1/orig -> origin/gh/galv/1/orig 2025-12-04T08:53:59.3263737Z * [new branch] gh/galv/2/base -> origin/gh/galv/2/base 2025-12-04T08:53:59.3263799Z * [new branch] gh/galv/2/head -> origin/gh/galv/2/head 2025-12-04T08:53:59.3263860Z * [new branch] gh/galv/2/orig -> origin/gh/galv/2/orig 2025-12-04T08:53:59.3263921Z * [new branch] gh/galv/3/base -> origin/gh/galv/3/base 2025-12-04T08:53:59.3263983Z * [new branch] gh/galv/3/head -> origin/gh/galv/3/head 2025-12-04T08:53:59.3264043Z * [new branch] gh/galv/3/orig -> origin/gh/galv/3/orig 2025-12-04T08:53:59.3264123Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-12-04T08:53:59.3264197Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-12-04T08:53:59.3264267Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-12-04T08:53:59.3264337Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-12-04T08:53:59.3264450Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-12-04T08:53:59.3264520Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-12-04T08:53:59.3264589Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-12-04T08:53:59.3264660Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-12-04T08:53:59.3264729Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-12-04T08:53:59.3264800Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-12-04T08:53:59.3264872Z * [new branch] 
gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-12-04T08:53:59.3264941Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-12-04T08:53:59.3265012Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-12-04T08:53:59.3265083Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-12-04T08:53:59.3265153Z * [new branch] gh/guangyey/170/orig -> origin/gh/guangyey/170/orig 2025-12-04T08:53:59.3265223Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-12-04T08:53:59.3265292Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-12-04T08:53:59.3265361Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-12-04T08:53:59.3265433Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-12-04T08:53:59.3265503Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-12-04T08:53:59.3265572Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-12-04T08:53:59.3265645Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-12-04T08:53:59.3265714Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-12-04T08:53:59.3265783Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-12-04T08:53:59.3265857Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-12-04T08:53:59.3265927Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-12-04T08:53:59.3265998Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-12-04T08:53:59.3266142Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-12-04T08:53:59.3266211Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-12-04T08:53:59.3266280Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-12-04T08:53:59.3266352Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 
2025-12-04T08:53:59.3266422Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-12-04T08:53:59.3266492Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-12-04T08:53:59.3266563Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-12-04T08:53:59.3266632Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-12-04T08:53:59.3266704Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-12-04T08:53:59.3266778Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-12-04T08:53:59.3266848Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-12-04T08:53:59.3266920Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-12-04T08:53:59.3267030Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-12-04T08:53:59.3267100Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-12-04T08:53:59.3267172Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-12-04T08:53:59.3267241Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base 2025-12-04T08:53:59.3267310Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head 2025-12-04T08:53:59.3267382Z * [new branch] gh/guangyey/208/orig -> origin/gh/guangyey/208/orig 2025-12-04T08:53:59.3267453Z * [new branch] gh/guangyey/228/base -> origin/gh/guangyey/228/base 2025-12-04T08:53:59.3267522Z * [new branch] gh/guangyey/228/head -> origin/gh/guangyey/228/head 2025-12-04T08:53:59.3267593Z * [new branch] gh/guangyey/228/orig -> origin/gh/guangyey/228/orig 2025-12-04T08:53:59.3267662Z * [new branch] gh/guangyey/230/base -> origin/gh/guangyey/230/base 2025-12-04T08:53:59.3267734Z * [new branch] gh/guangyey/230/head -> origin/gh/guangyey/230/head 2025-12-04T08:53:59.3267804Z * [new branch] gh/guangyey/230/orig -> origin/gh/guangyey/230/orig 2025-12-04T08:53:59.3267874Z * [new branch] gh/guangyey/231/base -> 
origin/gh/guangyey/231/base 2025-12-04T08:53:59.3267945Z * [new branch] gh/guangyey/231/head -> origin/gh/guangyey/231/head 2025-12-04T08:53:59.3268017Z * [new branch] gh/guangyey/231/orig -> origin/gh/guangyey/231/orig 2025-12-04T08:53:59.3268087Z * [new branch] gh/guangyey/232/base -> origin/gh/guangyey/232/base 2025-12-04T08:53:59.3268158Z * [new branch] gh/guangyey/232/head -> origin/gh/guangyey/232/head 2025-12-04T08:53:59.3268227Z * [new branch] gh/guangyey/232/orig -> origin/gh/guangyey/232/orig 2025-12-04T08:53:59.3268297Z * [new branch] gh/guangyey/233/base -> origin/gh/guangyey/233/base 2025-12-04T08:53:59.3268368Z * [new branch] gh/guangyey/233/head -> origin/gh/guangyey/233/head 2025-12-04T08:53:59.3268438Z * [new branch] gh/guangyey/233/orig -> origin/gh/guangyey/233/orig 2025-12-04T08:53:59.3268507Z * [new branch] gh/guangyey/234/base -> origin/gh/guangyey/234/base 2025-12-04T08:53:59.3268579Z * [new branch] gh/guangyey/234/head -> origin/gh/guangyey/234/head 2025-12-04T08:53:59.3268648Z * [new branch] gh/guangyey/234/orig -> origin/gh/guangyey/234/orig 2025-12-04T08:53:59.3268747Z * [new branch] gh/guangyey/235/base -> origin/gh/guangyey/235/base 2025-12-04T08:53:59.3268817Z * [new branch] gh/guangyey/235/head -> origin/gh/guangyey/235/head 2025-12-04T08:53:59.3268887Z * [new branch] gh/guangyey/235/orig -> origin/gh/guangyey/235/orig 2025-12-04T08:53:59.3268957Z * [new branch] gh/guangyey/236/base -> origin/gh/guangyey/236/base 2025-12-04T08:53:59.3269028Z * [new branch] gh/guangyey/236/head -> origin/gh/guangyey/236/head 2025-12-04T08:53:59.3269098Z * [new branch] gh/guangyey/236/orig -> origin/gh/guangyey/236/orig 2025-12-04T08:53:59.3269167Z * [new branch] gh/guangyey/237/base -> origin/gh/guangyey/237/base 2025-12-04T08:53:59.3269239Z * [new branch] gh/guangyey/237/head -> origin/gh/guangyey/237/head 2025-12-04T08:53:59.3269309Z * [new branch] gh/guangyey/237/orig -> origin/gh/guangyey/237/orig 2025-12-04T08:53:59.3269379Z * [new branch] 
gh/guangyey/238/base -> origin/gh/guangyey/238/base 2025-12-04T08:53:59.3269450Z * [new branch] gh/guangyey/238/head -> origin/gh/guangyey/238/head 2025-12-04T08:53:59.3269520Z * [new branch] gh/guangyey/239/base -> origin/gh/guangyey/239/base 2025-12-04T08:53:59.3269628Z * [new branch] gh/guangyey/239/head -> origin/gh/guangyey/239/head 2025-12-04T08:53:59.3269700Z * [new branch] gh/guangyey/239/orig -> origin/gh/guangyey/239/orig 2025-12-04T08:53:59.3269771Z * [new branch] gh/guangyey/240/base -> origin/gh/guangyey/240/base 2025-12-04T08:53:59.3269841Z * [new branch] gh/guangyey/240/head -> origin/gh/guangyey/240/head 2025-12-04T08:53:59.3269910Z * [new branch] gh/guangyey/240/orig -> origin/gh/guangyey/240/orig 2025-12-04T08:53:59.3269983Z * [new branch] gh/guangyey/241/base -> origin/gh/guangyey/241/base 2025-12-04T08:53:59.3270054Z * [new branch] gh/guangyey/241/head -> origin/gh/guangyey/241/head 2025-12-04T08:53:59.3270124Z * [new branch] gh/guangyey/241/orig -> origin/gh/guangyey/241/orig 2025-12-04T08:53:59.3270193Z * [new branch] gh/guangyey/242/base -> origin/gh/guangyey/242/base 2025-12-04T08:53:59.3270265Z * [new branch] gh/guangyey/242/head -> origin/gh/guangyey/242/head 2025-12-04T08:53:59.3270336Z * [new branch] gh/guangyey/242/orig -> origin/gh/guangyey/242/orig 2025-12-04T08:53:59.3270406Z * [new branch] gh/guangyey/243/base -> origin/gh/guangyey/243/base 2025-12-04T08:53:59.3270477Z * [new branch] gh/guangyey/243/head -> origin/gh/guangyey/243/head 2025-12-04T08:53:59.3270545Z * [new branch] gh/guangyey/243/orig -> origin/gh/guangyey/243/orig 2025-12-04T08:53:59.3270616Z * [new branch] gh/guangyey/244/base -> origin/gh/guangyey/244/base 2025-12-04T08:53:59.3270687Z * [new branch] gh/guangyey/244/head -> origin/gh/guangyey/244/head 2025-12-04T08:53:59.3270756Z * [new branch] gh/guangyey/244/orig -> origin/gh/guangyey/244/orig 2025-12-04T08:53:59.3270825Z * [new branch] gh/guangyey/245/base -> origin/gh/guangyey/245/base 
2025-12-04T08:53:59.3270898Z * [new branch] gh/guangyey/245/head -> origin/gh/guangyey/245/head 2025-12-04T08:53:59.3270968Z * [new branch] gh/guangyey/245/orig -> origin/gh/guangyey/245/orig 2025-12-04T08:53:59.3271037Z * [new branch] gh/guangyey/246/base -> origin/gh/guangyey/246/base 2025-12-04T08:53:59.3271108Z * [new branch] gh/guangyey/246/head -> origin/gh/guangyey/246/head 2025-12-04T08:53:59.3271177Z * [new branch] gh/guangyey/246/orig -> origin/gh/guangyey/246/orig 2025-12-04T08:53:59.3271279Z * [new branch] gh/guangyey/247/base -> origin/gh/guangyey/247/base 2025-12-04T08:53:59.3271349Z * [new branch] gh/guangyey/247/head -> origin/gh/guangyey/247/head 2025-12-04T08:53:59.3271418Z * [new branch] gh/guangyey/247/orig -> origin/gh/guangyey/247/orig 2025-12-04T08:53:59.3271491Z * [new branch] gh/guangyey/248/base -> origin/gh/guangyey/248/base 2025-12-04T08:53:59.3271560Z * [new branch] gh/guangyey/248/head -> origin/gh/guangyey/248/head 2025-12-04T08:53:59.3271630Z * [new branch] gh/guangyey/248/orig -> origin/gh/guangyey/248/orig 2025-12-04T08:53:59.3271701Z * [new branch] gh/guangyey/249/base -> origin/gh/guangyey/249/base 2025-12-04T08:53:59.3271770Z * [new branch] gh/guangyey/249/head -> origin/gh/guangyey/249/head 2025-12-04T08:53:59.3271840Z * [new branch] gh/guangyey/249/orig -> origin/gh/guangyey/249/orig 2025-12-04T08:53:59.3271943Z * [new branch] gh/guangyey/250/base -> origin/gh/guangyey/250/base 2025-12-04T08:53:59.3272013Z * [new branch] gh/guangyey/250/head -> origin/gh/guangyey/250/head 2025-12-04T08:53:59.3272082Z * [new branch] gh/guangyey/250/orig -> origin/gh/guangyey/250/orig 2025-12-04T08:53:59.3272198Z * [new branch] gh/guangyey/251/base -> origin/gh/guangyey/251/base 2025-12-04T08:53:59.3272268Z * [new branch] gh/guangyey/251/head -> origin/gh/guangyey/251/head 2025-12-04T08:53:59.3272338Z * [new branch] gh/guangyey/251/orig -> origin/gh/guangyey/251/orig 2025-12-04T08:53:59.3272409Z * [new branch] gh/guangyey/252/base -> 
origin/gh/guangyey/252/base 2025-12-04T08:53:59.3272479Z * [new branch] gh/guangyey/252/head -> origin/gh/guangyey/252/head 2025-12-04T08:53:59.3272549Z * [new branch] gh/guangyey/252/orig -> origin/gh/guangyey/252/orig 2025-12-04T08:53:59.3272622Z * [new branch] gh/guangyey/253/base -> origin/gh/guangyey/253/base 2025-12-04T08:53:59.3272691Z * [new branch] gh/guangyey/253/head -> origin/gh/guangyey/253/head 2025-12-04T08:53:59.3272762Z * [new branch] gh/guangyey/253/orig -> origin/gh/guangyey/253/orig 2025-12-04T08:53:59.3272832Z * [new branch] gh/guangyey/254/base -> origin/gh/guangyey/254/base 2025-12-04T08:53:59.3272902Z * [new branch] gh/guangyey/254/head -> origin/gh/guangyey/254/head 2025-12-04T08:53:59.3272973Z * [new branch] gh/guangyey/254/orig -> origin/gh/guangyey/254/orig 2025-12-04T08:53:59.3273042Z * [new branch] gh/guangyey/255/base -> origin/gh/guangyey/255/base 2025-12-04T08:53:59.3273112Z * [new branch] gh/guangyey/255/head -> origin/gh/guangyey/255/head 2025-12-04T08:53:59.3273184Z * [new branch] gh/guangyey/255/orig -> origin/gh/guangyey/255/orig 2025-12-04T08:53:59.3273284Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-12-04T08:53:59.3273374Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-12-04T08:53:59.3273464Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-12-04T08:53:59.3273553Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-12-04T08:53:59.3273638Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-12-04T08:53:59.3273726Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-12-04T08:53:59.3273812Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-12-04T08:53:59.3273898Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-12-04T08:53:59.3274026Z * [new 
branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-12-04T08:53:59.3274112Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-12-04T08:53:59.3274198Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-12-04T08:53:59.3274286Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-12-04T08:53:59.3274372Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-12-04T08:53:59.3274462Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-12-04T08:53:59.3274548Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-12-04T08:53:59.3274633Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-12-04T08:53:59.3274723Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-12-04T08:53:59.3274808Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-12-04T08:53:59.3274894Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-12-04T08:53:59.3275013Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-12-04T08:53:59.3275100Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-12-04T08:53:59.3275186Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-12-04T08:53:59.3275273Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-12-04T08:53:59.3275359Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-12-04T08:53:59.3275447Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-12-04T08:53:59.3275534Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-12-04T08:53:59.3275619Z * [new branch] 
gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-12-04T08:53:59.3275709Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-12-04T08:53:59.3275797Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-12-04T08:53:59.3275882Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-12-04T08:53:59.3275969Z * [new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-12-04T08:53:59.3276055Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-12-04T08:53:59.3276143Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-12-04T08:53:59.3276231Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-12-04T08:53:59.3276317Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T08:53:59.3276406Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T08:53:59.3276494Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T08:53:59.3276580Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T08:53:59.3276666Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T08:53:59.3276753Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T08:53:59.3276839Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T08:53:59.3276962Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T08:53:59.3277050Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T08:53:59.3277136Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T08:53:59.3277225Z * [new branch] 
gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T08:53:59.3277311Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T08:53:59.3277397Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T08:53:59.3277484Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T08:53:59.3277569Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T08:53:59.3277658Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T08:53:59.3277746Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T08:53:59.3277832Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T08:53:59.3277949Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T08:53:59.3278038Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T08:53:59.3278125Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T08:53:59.3278211Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T08:53:59.3278300Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T08:53:59.3278388Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T08:53:59.3278474Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T08:53:59.3278561Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T08:53:59.3278651Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T08:53:59.3278738Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T08:53:59.3278824Z * [new branch] 
gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T08:53:59.3278910Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T08:53:59.3278999Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T08:53:59.3279087Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T08:53:59.3279173Z * [new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T08:53:59.3279260Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T08:53:59.3279348Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T08:53:59.3279433Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T08:53:59.3279521Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T08:53:59.3279606Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T08:53:59.3279692Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T08:53:59.3279807Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T08:53:59.3279893Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T08:53:59.3279978Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T08:53:59.3280066Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T08:53:59.3280151Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T08:53:59.3280239Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T08:53:59.3280325Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T08:53:59.3280411Z * [new branch] 
gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T08:53:59.3280500Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T08:53:59.3280586Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T08:53:59.3280672Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T08:53:59.3280788Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T08:53:59.3280874Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T08:53:59.3280960Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T08:53:59.3281047Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T08:53:59.3281134Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T08:53:59.3281222Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T08:53:59.3281311Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T08:53:59.3281398Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T08:53:59.3281485Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T08:53:59.3281567Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T08:53:59.3281645Z * [new branch] gh/hameerabbasi/1/head -> origin/gh/hameerabbasi/1/head 2025-12-04T08:53:59.3281723Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T08:53:59.3281798Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T08:53:59.3281903Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T08:53:59.3281980Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 
2025-12-04T08:53:59.3282055Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T08:53:59.3282129Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T08:53:59.3282205Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T08:53:59.3282279Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T08:53:59.3282353Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T08:53:59.3282423Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T08:53:59.3282491Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T08:53:59.3282557Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T08:53:59.3282675Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T08:53:59.3282740Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T08:53:59.3282804Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T08:53:59.3282871Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T08:53:59.3282937Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T08:53:59.3283007Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T08:53:59.3283077Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T08:53:59.3283145Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T08:53:59.3283211Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T08:53:59.3283281Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T08:53:59.3283347Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T08:53:59.3283416Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T08:53:59.3283516Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T08:53:59.3283582Z * [new branch] gh/isuruf/159/head 
-> origin/gh/isuruf/159/head 2025-12-04T08:53:59.3283650Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T08:53:59.3283716Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T08:53:59.3283782Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T08:53:59.3283851Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T08:53:59.3283920Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T08:53:59.3283987Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T08:53:59.3284064Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T08:53:59.3284137Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T08:53:59.3284208Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T08:53:59.3284279Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T08:53:59.3284351Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T08:53:59.3284420Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-12-04T08:53:59.3284491Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T08:53:59.3284561Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T08:53:59.3284630Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T08:53:59.3284701Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 2025-12-04T08:53:59.3284772Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T08:53:59.3284841Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T08:53:59.3284911Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T08:53:59.3284981Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T08:53:59.3285052Z * [new branch] gh/jamesjwu/207/orig -> 
origin/gh/jamesjwu/207/orig 2025-12-04T08:53:59.3285121Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T08:53:59.3285218Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T08:53:59.3285289Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T08:53:59.3285359Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T08:53:59.3285430Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T08:53:59.3285502Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T08:53:59.3285571Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T08:53:59.3285640Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T08:53:59.3285709Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T08:53:59.3285777Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T08:53:59.3285846Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T08:53:59.3285916Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T08:53:59.3285985Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T08:53:59.3286079Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T08:53:59.3286149Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T08:53:59.3286217Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T08:53:59.3286285Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T08:53:59.3286355Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T08:53:59.3286422Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T08:53:59.3286491Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T08:53:59.3286560Z * [new branch] gh/jamesjwu/60/head -> 
origin/gh/jamesjwu/60/head 2025-12-04T08:53:59.3286628Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T08:53:59.3286698Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T08:53:59.3286766Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T08:53:59.3286835Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T08:53:59.3286904Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T08:53:59.3286971Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T08:53:59.3287039Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-12-04T08:53:59.3287110Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T08:53:59.3287178Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T08:53:59.3287246Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T08:53:59.3287318Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T08:53:59.3287387Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T08:53:59.3287456Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T08:53:59.3287525Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T08:53:59.3287593Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T08:53:59.3287661Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T08:53:59.3287771Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T08:53:59.3287840Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T08:53:59.3287908Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T08:53:59.3287980Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T08:53:59.3288049Z * [new branch] gh/janeyx99/299/head -> 
origin/gh/janeyx99/299/head 2025-12-04T08:53:59.3288118Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T08:53:59.3288187Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T08:53:59.3288256Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T08:53:59.3288328Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T08:53:59.3288396Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T08:53:59.3288465Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T08:53:59.3288535Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T08:53:59.3288632Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T08:53:59.3288701Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T08:53:59.3288771Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T08:53:59.3288839Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T08:53:59.3288907Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T08:53:59.3288980Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T08:53:59.3289048Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T08:53:59.3289117Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T08:53:59.3289188Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T08:53:59.3289257Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T08:53:59.3289327Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T08:53:59.3289395Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T08:53:59.3289464Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T08:53:59.3289533Z * [new branch] 
gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T08:53:59.3289603Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T08:53:59.3289672Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 2025-12-04T08:53:59.3289742Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T08:53:59.3289809Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T08:53:59.3289879Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T08:53:59.3289949Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T08:53:59.3290017Z * [new branch] gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T08:53:59.3290085Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T08:53:59.3290156Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T08:53:59.3290247Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T08:53:59.3290316Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T08:53:59.3290386Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T08:53:59.3290454Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T08:53:59.3290523Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T08:53:59.3290594Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T08:53:59.3290662Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T08:53:59.3290731Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T08:53:59.3290800Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T08:53:59.3290870Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T08:53:59.3290940Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 
2025-12-04T08:53:59.3291010Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T08:53:59.3291105Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T08:53:59.3291176Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T08:53:59.3291244Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T08:53:59.3291313Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T08:53:59.3291382Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T08:53:59.3291450Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T08:53:59.3291519Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T08:53:59.3291588Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T08:53:59.3291655Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T08:53:59.3291722Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T08:53:59.3291790Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T08:53:59.3291883Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T08:53:59.3291949Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T08:53:59.3292019Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T08:53:59.3292085Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T08:53:59.3292152Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T08:53:59.3292219Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T08:53:59.3292285Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T08:53:59.3292353Z * [new branch] gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T08:53:59.3292420Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T08:53:59.3292486Z * [new branch] 
gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T08:53:59.3292556Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T08:53:59.3292622Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 2025-12-04T08:53:59.3292687Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T08:53:59.3292794Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T08:53:59.3292860Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T08:53:59.3292928Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T08:53:59.3292995Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T08:53:59.3293062Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T08:53:59.3293128Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T08:53:59.3293196Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T08:53:59.3293262Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T08:53:59.3293328Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T08:53:59.3293398Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T08:53:59.3293464Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T08:53:59.3293530Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T08:53:59.3293597Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T08:53:59.3293706Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T08:53:59.3293772Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T08:53:59.3293840Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T08:53:59.3293906Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T08:53:59.3293973Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 
2025-12-04T08:53:59.3294041Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T08:53:59.3294106Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T08:53:59.3294174Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T08:53:59.3294240Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T08:53:59.3294309Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T08:53:59.3294377Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T08:53:59.3294443Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T08:53:59.3294509Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T08:53:59.3294577Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T08:53:59.3294645Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T08:53:59.3294710Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T08:53:59.3294778Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T08:53:59.3294844Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T08:53:59.3294911Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T08:53:59.3294979Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 2025-12-04T08:53:59.3295045Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T08:53:59.3295111Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T08:53:59.3295179Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T08:53:59.3295277Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T08:53:59.3295343Z * [new branch] gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T08:53:59.3295411Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T08:53:59.3295477Z * [new branch] gh/jansel/567/head -> 
origin/gh/jansel/567/head 2025-12-04T08:53:59.3295543Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T08:53:59.3295611Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T08:53:59.3295677Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T08:53:59.3295744Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T08:53:59.3295810Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T08:53:59.3295877Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T08:53:59.3295946Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T08:53:59.3296012Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T08:53:59.3296078Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T08:53:59.3296193Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T08:53:59.3296260Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T08:53:59.3296327Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T08:53:59.3296394Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T08:53:59.3296460Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T08:53:59.3296526Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T08:53:59.3296594Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T08:53:59.3296660Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T08:53:59.3296726Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T08:53:59.3296794Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T08:53:59.3296861Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T08:53:59.3296927Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T08:53:59.3296996Z * [new 
branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T08:53:59.3297062Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T08:53:59.3297128Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T08:53:59.3297196Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T08:53:59.3297262Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T08:53:59.3297329Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T08:53:59.3297396Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T08:53:59.3297476Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T08:53:59.3297555Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T08:53:59.3297631Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T08:53:59.3297705Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T08:53:59.3297811Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T08:53:59.3297885Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T08:53:59.3297957Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T08:53:59.3298031Z * [new branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T08:53:59.3298103Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T08:53:59.3298174Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T08:53:59.3298246Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T08:53:59.3298316Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T08:53:59.3298386Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T08:53:59.3298460Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 
2025-12-04T08:53:59.3298529Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T08:53:59.3298599Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T08:53:59.3298671Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T08:53:59.3298769Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T08:53:59.3298841Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T08:53:59.3298911Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T08:53:59.3298981Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T08:53:59.3299052Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T08:53:59.3299125Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T08:53:59.3299195Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T08:53:59.3299265Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T08:53:59.3299335Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T08:53:59.3299405Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T08:53:59.3299477Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T08:53:59.3299547Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T08:53:59.3299617Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T08:53:59.3299689Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T08:53:59.3299761Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T08:53:59.3299831Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T08:53:59.3299902Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T08:53:59.3299973Z * [new branch] gh/jiayisunx/84/head -> 
origin/gh/jiayisunx/84/head 2025-12-04T08:53:59.3300044Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T08:53:59.3300115Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T08:53:59.3300185Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T08:53:59.3300255Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T08:53:59.3300327Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 2025-12-04T08:53:59.3300426Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T08:53:59.3300498Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T08:53:59.3300568Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T08:53:59.3300639Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T08:53:59.3300710Z * [new branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T08:53:59.3300780Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T08:53:59.3300850Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T08:53:59.3300921Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T08:53:59.3300991Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T08:53:59.3301064Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T08:53:59.3301135Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T08:53:59.3301204Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T08:53:59.3301305Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T08:53:59.3301377Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T08:53:59.3301453Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T08:53:59.3301527Z * [new branch] 
gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T08:53:59.3301597Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T08:53:59.3301666Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T08:53:59.3301736Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T08:53:59.3301804Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T08:53:59.3301902Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T08:53:59.3301971Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T08:53:59.3302049Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T08:53:59.3302124Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T08:53:59.3302200Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T08:53:59.3302272Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T08:53:59.3302344Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T08:53:59.3302418Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T08:53:59.3302490Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T08:53:59.3302562Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T08:53:59.3302637Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T08:53:59.3302708Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T08:53:59.3302779Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T08:53:59.3302852Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T08:53:59.3302923Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T08:53:59.3302996Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T08:53:59.3303116Z 
* [new branch] gh/karthickai/14/orig -> origin/gh/karthickai/14/orig 2025-12-04T08:53:59.3303187Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T08:53:59.3303259Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T08:53:59.3303334Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T08:53:59.3303405Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 2025-12-04T08:53:59.3303478Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T08:53:59.3303550Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T08:53:59.3303621Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T08:53:59.3303695Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T08:53:59.3303767Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T08:53:59.3303839Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T08:53:59.3303912Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T08:53:59.3304024Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T08:53:59.3304096Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T08:53:59.3304171Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T08:53:59.3304243Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T08:53:59.3304314Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T08:53:59.3304389Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T08:53:59.3304460Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T08:53:59.3304532Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T08:53:59.3304604Z * [new branch] gh/karthickai/21/head -> 
origin/gh/karthickai/21/head 2025-12-04T08:53:59.3304677Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T08:53:59.3304748Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T08:53:59.3304821Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T08:53:59.3304893Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T08:53:59.3304964Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T08:53:59.3305038Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T08:53:59.3305110Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T08:53:59.3305183Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T08:53:59.3305256Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T08:53:59.3305329Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T08:53:59.3305403Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T08:53:59.3305474Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T08:53:59.3305545Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T08:53:59.3305618Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T08:53:59.3305719Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T08:53:59.3305790Z * [new branch] gh/karthickai/26/orig -> origin/gh/karthickai/26/orig 2025-12-04T08:53:59.3305862Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T08:53:59.3305935Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T08:53:59.3306006Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T08:53:59.3306075Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 
2025-12-04T08:53:59.3306141Z * [new branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T08:53:59.3306206Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T08:53:59.3306271Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T08:53:59.3306337Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T08:53:59.3306400Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T08:53:59.3306480Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T08:53:59.3306600Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T08:53:59.3306677Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T08:53:59.3306751Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T08:53:59.3306824Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T08:53:59.3306898Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T08:53:59.3306970Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T08:53:59.3307044Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T08:53:59.3307119Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T08:53:59.3307192Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T08:53:59.3307265Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T08:53:59.3307339Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T08:53:59.3307413Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T08:53:59.3307485Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T08:53:59.3307560Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T08:53:59.3307633Z * [new branch] 
gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T08:53:59.3307707Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T08:53:59.3307780Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T08:53:59.3307853Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T08:53:59.3307927Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T08:53:59.3308001Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T08:53:59.3308074Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T08:53:59.3308146Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T08:53:59.3308220Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T08:53:59.3308320Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T08:53:59.3308391Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T08:53:59.3308460Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-12-04T08:53:59.3308529Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T08:53:59.3308600Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T08:53:59.3308668Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T08:53:59.3308736Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T08:53:59.3308805Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-12-04T08:53:59.3308873Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T08:53:59.3308942Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T08:53:59.3309012Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T08:53:59.3309080Z * [new branch] gh/kwen2501/211/base -> 
origin/gh/kwen2501/211/base 2025-12-04T08:53:59.3309148Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T08:53:59.3309247Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T08:53:59.3309315Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T08:53:59.3309383Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T08:53:59.3309452Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T08:53:59.3309520Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T08:53:59.3309595Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T08:53:59.3309664Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T08:53:59.3309736Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T08:53:59.3309806Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T08:53:59.3309876Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T08:53:59.3309947Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T08:53:59.3310015Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T08:53:59.3310083Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T08:53:59.3310154Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T08:53:59.3310225Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T08:53:59.3310293Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T08:53:59.3310363Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T08:53:59.3310433Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T08:53:59.3310502Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T08:53:59.3310572Z * [new branch] 
gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T08:53:59.3310640Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T08:53:59.3310710Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T08:53:59.3310779Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T08:53:59.3310872Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T08:53:59.3310942Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T08:53:59.3311011Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T08:53:59.3311082Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 2025-12-04T08:53:59.3311153Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-12-04T08:53:59.3311221Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-12-04T08:53:59.3311290Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-12-04T08:53:59.3311361Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-12-04T08:53:59.3311430Z * [new branch] gh/kwen2501/252/head -> origin/gh/kwen2501/252/head 2025-12-04T08:53:59.3311499Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-12-04T08:53:59.3311570Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-12-04T08:53:59.3311639Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-12-04T08:53:59.3311743Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-12-04T08:53:59.3311814Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-12-04T08:53:59.3311909Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-12-04T08:53:59.3311977Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-12-04T08:53:59.3312045Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 
2025-12-04T08:53:59.3312114Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-12-04T08:53:59.3312188Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-12-04T08:53:59.3312255Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-12-04T08:53:59.3312323Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 2025-12-04T08:53:59.3312398Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-12-04T08:53:59.3312468Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-12-04T08:53:59.3312538Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-12-04T08:53:59.3312607Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-12-04T08:53:59.3312676Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-12-04T08:53:59.3312745Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-12-04T08:53:59.3312816Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-12-04T08:53:59.3312885Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-12-04T08:53:59.3312953Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-12-04T08:53:59.3313024Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-12-04T08:53:59.3313093Z * [new branch] gh/kwen2501/275/base -> origin/gh/kwen2501/275/base 2025-12-04T08:53:59.3313161Z * [new branch] gh/kwen2501/275/head -> origin/gh/kwen2501/275/head 2025-12-04T08:53:59.3313230Z * [new branch] gh/kwen2501/275/orig -> origin/gh/kwen2501/275/orig 2025-12-04T08:53:59.3313298Z * [new branch] gh/kwen2501/276/base -> origin/gh/kwen2501/276/base 2025-12-04T08:53:59.3313367Z * [new branch] gh/kwen2501/276/head -> origin/gh/kwen2501/276/head 2025-12-04T08:53:59.3313484Z * [new branch] gh/kwen2501/276/orig -> origin/gh/kwen2501/276/orig 2025-12-04T08:53:59.3313552Z * [new branch] gh/kwen2501/277/base -> 
origin/gh/kwen2501/277/base 2025-12-04T08:53:59.3313622Z * [new branch] gh/kwen2501/277/head -> origin/gh/kwen2501/277/head 2025-12-04T08:53:59.3313694Z * [new branch] gh/kwen2501/277/orig -> origin/gh/kwen2501/277/orig 2025-12-04T08:53:59.3313762Z * [new branch] gh/kwen2501/278/base -> origin/gh/kwen2501/278/base 2025-12-04T08:53:59.3313831Z * [new branch] gh/kwen2501/278/head -> origin/gh/kwen2501/278/head 2025-12-04T08:53:59.3313899Z * [new branch] gh/kwen2501/278/orig -> origin/gh/kwen2501/278/orig 2025-12-04T08:53:59.3313967Z * [new branch] gh/kwen2501/279/base -> origin/gh/kwen2501/279/base 2025-12-04T08:53:59.3314038Z * [new branch] gh/kwen2501/279/head -> origin/gh/kwen2501/279/head 2025-12-04T08:53:59.3314106Z * [new branch] gh/kwen2501/279/orig -> origin/gh/kwen2501/279/orig 2025-12-04T08:53:59.3314173Z * [new branch] gh/kwen2501/280/base -> origin/gh/kwen2501/280/base 2025-12-04T08:53:59.3314242Z * [new branch] gh/kwen2501/280/head -> origin/gh/kwen2501/280/head 2025-12-04T08:53:59.3314348Z * [new branch] gh/kwen2501/280/orig -> origin/gh/kwen2501/280/orig 2025-12-04T08:53:59.3314416Z * [new branch] gh/kwen2501/281/base -> origin/gh/kwen2501/281/base 2025-12-04T08:53:59.3314485Z * [new branch] gh/kwen2501/281/head -> origin/gh/kwen2501/281/head 2025-12-04T08:53:59.3314553Z * [new branch] gh/kwen2501/281/orig -> origin/gh/kwen2501/281/orig 2025-12-04T08:53:59.3314620Z * [new branch] gh/kwen2501/282/base -> origin/gh/kwen2501/282/base 2025-12-04T08:53:59.3314691Z * [new branch] gh/kwen2501/282/head -> origin/gh/kwen2501/282/head 2025-12-04T08:53:59.3314759Z * [new branch] gh/kwen2501/282/orig -> origin/gh/kwen2501/282/orig 2025-12-04T08:53:59.3314826Z * [new branch] gh/kwen2501/283/base -> origin/gh/kwen2501/283/base 2025-12-04T08:53:59.3314896Z * [new branch] gh/kwen2501/283/head -> origin/gh/kwen2501/283/head 2025-12-04T08:53:59.3314965Z * [new branch] gh/kwen2501/283/orig -> origin/gh/kwen2501/283/orig 2025-12-04T08:53:59.3315033Z * [new branch] 
gh/kwen2501/284/base -> origin/gh/kwen2501/284/base 2025-12-04T08:53:59.3315102Z * [new branch] gh/kwen2501/284/head -> origin/gh/kwen2501/284/head 2025-12-04T08:53:59.3315170Z * [new branch] gh/kwen2501/284/orig -> origin/gh/kwen2501/284/orig 2025-12-04T08:53:59.3315239Z * [new branch] gh/kwen2501/285/base -> origin/gh/kwen2501/285/base 2025-12-04T08:53:59.3315308Z * [new branch] gh/kwen2501/285/head -> origin/gh/kwen2501/285/head 2025-12-04T08:53:59.3315376Z * [new branch] gh/kwen2501/285/orig -> origin/gh/kwen2501/285/orig 2025-12-04T08:53:59.3315445Z * [new branch] gh/kwen2501/286/base -> origin/gh/kwen2501/286/base 2025-12-04T08:53:59.3315513Z * [new branch] gh/kwen2501/286/head -> origin/gh/kwen2501/286/head 2025-12-04T08:53:59.3315582Z * [new branch] gh/kwen2501/286/orig -> origin/gh/kwen2501/286/orig 2025-12-04T08:53:59.3315651Z * [new branch] gh/kwen2501/287/base -> origin/gh/kwen2501/287/base 2025-12-04T08:53:59.3315719Z * [new branch] gh/kwen2501/287/head -> origin/gh/kwen2501/287/head 2025-12-04T08:53:59.3315787Z * [new branch] gh/kwen2501/287/orig -> origin/gh/kwen2501/287/orig 2025-12-04T08:53:59.3315856Z * [new branch] gh/kwen2501/288/base -> origin/gh/kwen2501/288/base 2025-12-04T08:53:59.3315959Z * [new branch] gh/kwen2501/288/head -> origin/gh/kwen2501/288/head 2025-12-04T08:53:59.3316027Z * [new branch] gh/kwen2501/288/orig -> origin/gh/kwen2501/288/orig 2025-12-04T08:53:59.3316103Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-12-04T08:53:59.3316179Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-12-04T08:53:59.3316252Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-12-04T08:53:59.3316325Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-12-04T08:53:59.3316397Z * [new branch] gh/laithsakka/276/head -> origin/gh/laithsakka/276/head 2025-12-04T08:53:59.3316469Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 
2025-12-04T08:53:59.3316543Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-12-04T08:53:59.3316617Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-12-04T08:53:59.3316690Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-12-04T08:53:59.3316762Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-12-04T08:53:59.3316859Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-12-04T08:53:59.3316932Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-12-04T08:53:59.3317005Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-12-04T08:53:59.3317078Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-12-04T08:53:59.3317152Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-12-04T08:53:59.3317224Z * [new branch] gh/laithsakka/316/base -> origin/gh/laithsakka/316/base 2025-12-04T08:53:59.3317297Z * [new branch] gh/laithsakka/316/head -> origin/gh/laithsakka/316/head 2025-12-04T08:53:59.3317370Z * [new branch] gh/laithsakka/316/orig -> origin/gh/laithsakka/316/orig 2025-12-04T08:53:59.3317441Z * [new branch] gh/laithsakka/317/base -> origin/gh/laithsakka/317/base 2025-12-04T08:53:59.3317514Z * [new branch] gh/laithsakka/317/head -> origin/gh/laithsakka/317/head 2025-12-04T08:53:59.3317586Z * [new branch] gh/laithsakka/317/orig -> origin/gh/laithsakka/317/orig 2025-12-04T08:53:59.3317658Z * [new branch] gh/laithsakka/319/base -> origin/gh/laithsakka/319/base 2025-12-04T08:53:59.3317730Z * [new branch] gh/laithsakka/319/head -> origin/gh/laithsakka/319/head 2025-12-04T08:53:59.3317802Z * [new branch] gh/laithsakka/319/orig -> origin/gh/laithsakka/319/orig 2025-12-04T08:53:59.3317874Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-12-04T08:53:59.3317947Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 
2025-12-04T08:53:59.3318020Z * [new branch] gh/laithsakka/320/base -> origin/gh/laithsakka/320/base 2025-12-04T08:53:59.3318092Z * [new branch] gh/laithsakka/320/head -> origin/gh/laithsakka/320/head 2025-12-04T08:53:59.3318167Z * [new branch] gh/laithsakka/320/orig -> origin/gh/laithsakka/320/orig 2025-12-04T08:53:59.3318240Z * [new branch] gh/laithsakka/321/base -> origin/gh/laithsakka/321/base 2025-12-04T08:53:59.3318312Z * [new branch] gh/laithsakka/321/head -> origin/gh/laithsakka/321/head 2025-12-04T08:53:59.3318385Z * [new branch] gh/laithsakka/321/orig -> origin/gh/laithsakka/321/orig 2025-12-04T08:53:59.3318457Z * [new branch] gh/laithsakka/322/base -> origin/gh/laithsakka/322/base 2025-12-04T08:53:59.3318556Z * [new branch] gh/laithsakka/322/head -> origin/gh/laithsakka/322/head 2025-12-04T08:53:59.3318629Z * [new branch] gh/laithsakka/322/orig -> origin/gh/laithsakka/322/orig 2025-12-04T08:53:59.3318701Z * [new branch] gh/laithsakka/323/base -> origin/gh/laithsakka/323/base 2025-12-04T08:53:59.3318773Z * [new branch] gh/laithsakka/323/head -> origin/gh/laithsakka/323/head 2025-12-04T08:53:59.3318846Z * [new branch] gh/laithsakka/323/orig -> origin/gh/laithsakka/323/orig 2025-12-04T08:53:59.3318919Z * [new branch] gh/laithsakka/324/base -> origin/gh/laithsakka/324/base 2025-12-04T08:53:59.3318991Z * [new branch] gh/laithsakka/324/head -> origin/gh/laithsakka/324/head 2025-12-04T08:53:59.3319064Z * [new branch] gh/laithsakka/324/orig -> origin/gh/laithsakka/324/orig 2025-12-04T08:53:59.3319136Z * [new branch] gh/laithsakka/325/base -> origin/gh/laithsakka/325/base 2025-12-04T08:53:59.3319210Z * [new branch] gh/laithsakka/325/head -> origin/gh/laithsakka/325/head 2025-12-04T08:53:59.3319285Z * [new branch] gh/laithsakka/325/orig -> origin/gh/laithsakka/325/orig 2025-12-04T08:53:59.3319356Z * [new branch] gh/laithsakka/326/base -> origin/gh/laithsakka/326/base 2025-12-04T08:53:59.3319457Z * [new branch] gh/laithsakka/326/head -> origin/gh/laithsakka/326/head 
2025-12-04T08:53:59.3319532Z * [new branch] gh/laithsakka/326/orig -> origin/gh/laithsakka/326/orig 2025-12-04T08:53:59.3319605Z * [new branch] gh/laithsakka/327/base -> origin/gh/laithsakka/327/base 2025-12-04T08:53:59.3319677Z * [new branch] gh/laithsakka/327/head -> origin/gh/laithsakka/327/head 2025-12-04T08:53:59.3319750Z * [new branch] gh/laithsakka/327/orig -> origin/gh/laithsakka/327/orig 2025-12-04T08:53:59.3319821Z * [new branch] gh/laithsakka/328/base -> origin/gh/laithsakka/328/base 2025-12-04T08:53:59.3319897Z * [new branch] gh/laithsakka/328/head -> origin/gh/laithsakka/328/head 2025-12-04T08:53:59.3319969Z * [new branch] gh/laithsakka/328/orig -> origin/gh/laithsakka/328/orig 2025-12-04T08:53:59.3320037Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-12-04T08:53:59.3320108Z * [new branch] gh/liangel/4/head -> origin/gh/liangel/4/head 2025-12-04T08:53:59.3320175Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-12-04T08:53:59.3320250Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-12-04T08:53:59.3320323Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-12-04T08:53:59.3320386Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-12-04T08:53:59.3320447Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-12-04T08:53:59.3320512Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-12-04T08:53:59.3320572Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-12-04T08:53:59.3320632Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-12-04T08:53:59.3320694Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-12-04T08:53:59.3320754Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-12-04T08:53:59.3320814Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-12-04T08:53:59.3320875Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-12-04T08:53:59.3320944Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 
2025-12-04T08:53:59.3321014Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-12-04T08:53:59.3321112Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-12-04T08:53:59.3321181Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-12-04T08:53:59.3321248Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-12-04T08:53:59.3321316Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-12-04T08:53:59.3321383Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-12-04T08:53:59.3321451Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-12-04T08:53:59.3321517Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-12-04T08:53:59.3321584Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-12-04T08:53:59.3321652Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-12-04T08:53:59.3321720Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 2025-12-04T08:53:59.3321786Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-12-04T08:53:59.3321903Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-12-04T08:53:59.3321969Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-12-04T08:53:59.3322084Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-12-04T08:53:59.3322152Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 2025-12-04T08:53:59.3322219Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-12-04T08:53:59.3322286Z * [new branch] gh/malfet/565/base -> origin/gh/malfet/565/base 2025-12-04T08:53:59.3322353Z * [new branch] gh/malfet/565/head -> origin/gh/malfet/565/head 2025-12-04T08:53:59.3322422Z * [new branch] gh/malfet/565/orig -> origin/gh/malfet/565/orig 2025-12-04T08:53:59.3322488Z * [new branch] gh/malfet/575/base -> origin/gh/malfet/575/base 2025-12-04T08:53:59.3322556Z * [new branch] gh/malfet/575/head -> 
origin/gh/malfet/575/head 2025-12-04T08:53:59.3322623Z * [new branch] gh/malfet/575/orig -> origin/gh/malfet/575/orig 2025-12-04T08:53:59.3322693Z * [new branch] gh/malfet/580/base -> origin/gh/malfet/580/base 2025-12-04T08:53:59.3322762Z * [new branch] gh/malfet/580/head -> origin/gh/malfet/580/head 2025-12-04T08:53:59.3322828Z * [new branch] gh/malfet/580/orig -> origin/gh/malfet/580/orig 2025-12-04T08:53:59.3322895Z * [new branch] gh/malfet/581/base -> origin/gh/malfet/581/base 2025-12-04T08:53:59.3322963Z * [new branch] gh/malfet/581/head -> origin/gh/malfet/581/head 2025-12-04T08:53:59.3323033Z * [new branch] gh/malfet/581/orig -> origin/gh/malfet/581/orig 2025-12-04T08:53:59.3323102Z * [new branch] gh/malfet/583/base -> origin/gh/malfet/583/base 2025-12-04T08:53:59.3323168Z * [new branch] gh/malfet/583/head -> origin/gh/malfet/583/head 2025-12-04T08:53:59.3323235Z * [new branch] gh/malfet/583/orig -> origin/gh/malfet/583/orig 2025-12-04T08:53:59.3323303Z * [new branch] gh/malfet/586/base -> origin/gh/malfet/586/base 2025-12-04T08:53:59.3323368Z * [new branch] gh/malfet/586/head -> origin/gh/malfet/586/head 2025-12-04T08:53:59.3323435Z * [new branch] gh/malfet/586/orig -> origin/gh/malfet/586/orig 2025-12-04T08:53:59.3323504Z * [new branch] gh/malfet/587/base -> origin/gh/malfet/587/base 2025-12-04T08:53:59.3323570Z * [new branch] gh/malfet/587/head -> origin/gh/malfet/587/head 2025-12-04T08:53:59.3323636Z * [new branch] gh/malfet/587/orig -> origin/gh/malfet/587/orig 2025-12-04T08:53:59.3323753Z * [new branch] gh/malfet/588/base -> origin/gh/malfet/588/base 2025-12-04T08:53:59.3323820Z * [new branch] gh/malfet/588/head -> origin/gh/malfet/588/head 2025-12-04T08:53:59.3323888Z * [new branch] gh/malfet/588/orig -> origin/gh/malfet/588/orig 2025-12-04T08:53:59.3323958Z * [new branch] gh/malfet/589/base -> origin/gh/malfet/589/base 2025-12-04T08:53:59.3324025Z * [new branch] gh/malfet/589/head -> origin/gh/malfet/589/head 2025-12-04T08:53:59.3324092Z * [new 
branch] gh/malfet/589/orig -> origin/gh/malfet/589/orig 2025-12-04T08:53:59.3324161Z * [new branch] gh/malfet/590/base -> origin/gh/malfet/590/base 2025-12-04T08:53:59.3324228Z * [new branch] gh/malfet/590/head -> origin/gh/malfet/590/head 2025-12-04T08:53:59.3324294Z * [new branch] gh/malfet/590/orig -> origin/gh/malfet/590/orig 2025-12-04T08:53:59.3324363Z * [new branch] gh/malfet/591/base -> origin/gh/malfet/591/base 2025-12-04T08:53:59.3324428Z * [new branch] gh/malfet/591/head -> origin/gh/malfet/591/head 2025-12-04T08:53:59.3324495Z * [new branch] gh/malfet/591/orig -> origin/gh/malfet/591/orig 2025-12-04T08:53:59.3324593Z * [new branch] gh/malfet/592/base -> origin/gh/malfet/592/base 2025-12-04T08:53:59.3324660Z * [new branch] gh/malfet/592/head -> origin/gh/malfet/592/head 2025-12-04T08:53:59.3324726Z * [new branch] gh/malfet/592/orig -> origin/gh/malfet/592/orig 2025-12-04T08:53:59.3324793Z * [new branch] gh/malfet/593/base -> origin/gh/malfet/593/base 2025-12-04T08:53:59.3324860Z * [new branch] gh/malfet/593/head -> origin/gh/malfet/593/head 2025-12-04T08:53:59.3324927Z * [new branch] gh/malfet/593/orig -> origin/gh/malfet/593/orig 2025-12-04T08:53:59.3324997Z * [new branch] gh/malfet/594/base -> origin/gh/malfet/594/base 2025-12-04T08:53:59.3325063Z * [new branch] gh/malfet/594/head -> origin/gh/malfet/594/head 2025-12-04T08:53:59.3325132Z * [new branch] gh/malfet/594/orig -> origin/gh/malfet/594/orig 2025-12-04T08:53:59.3325200Z * [new branch] gh/malfet/595/base -> origin/gh/malfet/595/base 2025-12-04T08:53:59.3325266Z * [new branch] gh/malfet/595/head -> origin/gh/malfet/595/head 2025-12-04T08:53:59.3325334Z * [new branch] gh/malfet/595/orig -> origin/gh/malfet/595/orig 2025-12-04T08:53:59.3325399Z * [new branch] gh/malfet/596/base -> origin/gh/malfet/596/base 2025-12-04T08:53:59.3325465Z * [new branch] gh/malfet/596/head -> origin/gh/malfet/596/head 2025-12-04T08:53:59.3325533Z * [new branch] gh/malfet/596/orig -> origin/gh/malfet/596/orig 
2025-12-04T08:53:59.3325601Z * [new branch] gh/malfet/597/base -> origin/gh/malfet/597/base 2025-12-04T08:53:59.3325666Z * [new branch] gh/malfet/597/head -> origin/gh/malfet/597/head 2025-12-04T08:53:59.3325735Z * [new branch] gh/malfet/597/orig -> origin/gh/malfet/597/orig 2025-12-04T08:53:59.3325802Z * [new branch] gh/malfet/598/base -> origin/gh/malfet/598/base 2025-12-04T08:53:59.3325869Z * [new branch] gh/malfet/598/head -> origin/gh/malfet/598/head 2025-12-04T08:53:59.3325936Z * [new branch] gh/malfet/598/orig -> origin/gh/malfet/598/orig 2025-12-04T08:53:59.3326002Z * [new branch] gh/malfet/599/base -> origin/gh/malfet/599/base 2025-12-04T08:53:59.3326068Z * [new branch] gh/malfet/599/head -> origin/gh/malfet/599/head 2025-12-04T08:53:59.3326136Z * [new branch] gh/malfet/599/orig -> origin/gh/malfet/599/orig 2025-12-04T08:53:59.3326242Z * [new branch] gh/malfet/600/base -> origin/gh/malfet/600/base 2025-12-04T08:53:59.3326308Z * [new branch] gh/malfet/600/head -> origin/gh/malfet/600/head 2025-12-04T08:53:59.3326378Z * [new branch] gh/malfet/600/orig -> origin/gh/malfet/600/orig 2025-12-04T08:53:59.3326446Z * [new branch] gh/malfet/601/base -> origin/gh/malfet/601/base 2025-12-04T08:53:59.3326515Z * [new branch] gh/malfet/601/head -> origin/gh/malfet/601/head 2025-12-04T08:53:59.3326581Z * [new branch] gh/malfet/601/orig -> origin/gh/malfet/601/orig 2025-12-04T08:53:59.3326648Z * [new branch] gh/malfet/602/base -> origin/gh/malfet/602/base 2025-12-04T08:53:59.3326717Z * [new branch] gh/malfet/602/head -> origin/gh/malfet/602/head 2025-12-04T08:53:59.3326784Z * [new branch] gh/malfet/602/orig -> origin/gh/malfet/602/orig 2025-12-04T08:53:59.3326853Z * [new branch] gh/malfet/603/base -> origin/gh/malfet/603/base 2025-12-04T08:53:59.3326920Z * [new branch] gh/malfet/603/head -> origin/gh/malfet/603/head 2025-12-04T08:53:59.3326987Z * [new branch] gh/malfet/603/orig -> origin/gh/malfet/603/orig 2025-12-04T08:53:59.3327078Z * [new branch] gh/malfet/604/base -> 
origin/gh/malfet/604/base 2025-12-04T08:53:59.3327147Z * [new branch] gh/malfet/604/head -> origin/gh/malfet/604/head 2025-12-04T08:53:59.3327214Z * [new branch] gh/malfet/604/orig -> origin/gh/malfet/604/orig 2025-12-04T08:53:59.3327279Z * [new branch] gh/malfet/605/base -> origin/gh/malfet/605/base 2025-12-04T08:53:59.3327346Z * [new branch] gh/malfet/605/head -> origin/gh/malfet/605/head 2025-12-04T08:53:59.3327413Z * [new branch] gh/malfet/605/orig -> origin/gh/malfet/605/orig 2025-12-04T08:53:59.3327482Z * [new branch] gh/malfet/606/base -> origin/gh/malfet/606/base 2025-12-04T08:53:59.3327550Z * [new branch] gh/malfet/606/head -> origin/gh/malfet/606/head 2025-12-04T08:53:59.3327617Z * [new branch] gh/malfet/606/orig -> origin/gh/malfet/606/orig 2025-12-04T08:53:59.3327684Z * [new branch] gh/malfet/607/base -> origin/gh/malfet/607/base 2025-12-04T08:53:59.3327752Z * [new branch] gh/malfet/607/head -> origin/gh/malfet/607/head 2025-12-04T08:53:59.3327817Z * [new branch] gh/malfet/607/orig -> origin/gh/malfet/607/orig 2025-12-04T08:53:59.3327884Z * [new branch] gh/malfet/608/base -> origin/gh/malfet/608/base 2025-12-04T08:53:59.3327957Z * [new branch] gh/malfet/608/head -> origin/gh/malfet/608/head 2025-12-04T08:53:59.3328023Z * [new branch] gh/malfet/608/orig -> origin/gh/malfet/608/orig 2025-12-04T08:53:59.3328093Z * [new branch] gh/malfet/609/base -> origin/gh/malfet/609/base 2025-12-04T08:53:59.3328160Z * [new branch] gh/malfet/609/head -> origin/gh/malfet/609/head 2025-12-04T08:53:59.3328226Z * [new branch] gh/malfet/609/orig -> origin/gh/malfet/609/orig 2025-12-04T08:53:59.3328296Z * [new branch] gh/malfet/610/base -> origin/gh/malfet/610/base 2025-12-04T08:53:59.3328363Z * [new branch] gh/malfet/610/head -> origin/gh/malfet/610/head 2025-12-04T08:53:59.3328428Z * [new branch] gh/malfet/610/orig -> origin/gh/malfet/610/orig 2025-12-04T08:53:59.3328495Z * [new branch] gh/malfet/611/base -> origin/gh/malfet/611/base 2025-12-04T08:53:59.3328561Z * [new 
branch] gh/malfet/611/head -> origin/gh/malfet/611/head 2025-12-04T08:53:59.3328628Z * [new branch] gh/malfet/611/orig -> origin/gh/malfet/611/orig 2025-12-04T08:53:59.3328727Z * [new branch] gh/malfet/612/base -> origin/gh/malfet/612/base 2025-12-04T08:53:59.3328793Z * [new branch] gh/malfet/612/head -> origin/gh/malfet/612/head 2025-12-04T08:53:59.3328858Z * [new branch] gh/malfet/612/orig -> origin/gh/malfet/612/orig 2025-12-04T08:53:59.3328927Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-12-04T08:53:59.3328993Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-12-04T08:53:59.3329081Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-12-04T08:53:59.3329169Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-12-04T08:53:59.3329252Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-12-04T08:53:59.3329319Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-12-04T08:53:59.3329394Z * [new branch] gh/masnesral/1/base -> origin/gh/masnesral/1/base 2025-12-04T08:53:59.3329465Z * [new branch] gh/masnesral/1/head -> origin/gh/masnesral/1/head 2025-12-04T08:53:59.3329536Z * [new branch] gh/masnesral/1/orig -> origin/gh/masnesral/1/orig 2025-12-04T08:53:59.3329634Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-12-04T08:53:59.3329704Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-12-04T08:53:59.3329775Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-12-04T08:53:59.3329843Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-12-04T08:53:59.3329912Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-12-04T08:53:59.3329982Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-12-04T08:53:59.3330052Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-12-04T08:53:59.3330122Z * [new branch] 
gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-12-04T08:53:59.3330192Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-12-04T08:53:59.3330262Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-12-04T08:53:59.3330331Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-12-04T08:53:59.3330404Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-12-04T08:53:59.3330472Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-12-04T08:53:59.3330541Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-12-04T08:53:59.3330641Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-12-04T08:53:59.3330737Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-12-04T08:53:59.3330830Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-12-04T08:53:59.3330924Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-12-04T08:53:59.3331017Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-12-04T08:53:59.3331107Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-12-04T08:53:59.3331199Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-12-04T08:53:59.3331287Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-12-04T08:53:59.3331379Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-12-04T08:53:59.3331492Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-12-04T08:53:59.3331582Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-12-04T08:53:59.3331673Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-12-04T08:53:59.3331765Z * [new 
branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-12-04T08:53:59.3331889Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-12-04T08:53:59.3331981Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-12-04T08:53:59.3332072Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-12-04T08:53:59.3332161Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-12-04T08:53:59.3332256Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-12-04T08:53:59.3332347Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-12-04T08:53:59.3332438Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-12-04T08:53:59.3332576Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-12-04T08:53:59.3332666Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-12-04T08:53:59.3332755Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-12-04T08:53:59.3332845Z * [new branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-12-04T08:53:59.3332935Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-12-04T08:53:59.3333027Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-12-04T08:53:59.3333118Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-12-04T08:53:59.3333208Z * [new branch] gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-12-04T08:53:59.3333301Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-12-04T08:53:59.3333391Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 
2025-12-04T08:53:59.3333480Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-12-04T08:53:59.3333572Z * [new branch] gh/mikaylagawarecki/351/base -> origin/gh/mikaylagawarecki/351/base 2025-12-04T08:53:59.3333663Z * [new branch] gh/mikaylagawarecki/351/head -> origin/gh/mikaylagawarecki/351/head 2025-12-04T08:53:59.3333754Z * [new branch] gh/mikaylagawarecki/351/orig -> origin/gh/mikaylagawarecki/351/orig 2025-12-04T08:53:59.3333846Z * [new branch] gh/mikaylagawarecki/352/base -> origin/gh/mikaylagawarecki/352/base 2025-12-04T08:53:59.3333936Z * [new branch] gh/mikaylagawarecki/352/head -> origin/gh/mikaylagawarecki/352/head 2025-12-04T08:53:59.3334025Z * [new branch] gh/mikaylagawarecki/352/orig -> origin/gh/mikaylagawarecki/352/orig 2025-12-04T08:53:59.3334119Z * [new branch] gh/mikaylagawarecki/353/base -> origin/gh/mikaylagawarecki/353/base 2025-12-04T08:53:59.3334208Z * [new branch] gh/mikaylagawarecki/353/head -> origin/gh/mikaylagawarecki/353/head 2025-12-04T08:53:59.3334300Z * [new branch] gh/mikaylagawarecki/353/orig -> origin/gh/mikaylagawarecki/353/orig 2025-12-04T08:53:59.3334391Z * [new branch] gh/mikaylagawarecki/354/base -> origin/gh/mikaylagawarecki/354/base 2025-12-04T08:53:59.3334532Z * [new branch] gh/mikaylagawarecki/354/head -> origin/gh/mikaylagawarecki/354/head 2025-12-04T08:53:59.3334624Z * [new branch] gh/mikaylagawarecki/354/orig -> origin/gh/mikaylagawarecki/354/orig 2025-12-04T08:53:59.3334714Z * [new branch] gh/mikaylagawarecki/356/base -> origin/gh/mikaylagawarecki/356/base 2025-12-04T08:53:59.3334806Z * [new branch] gh/mikaylagawarecki/356/head -> origin/gh/mikaylagawarecki/356/head 2025-12-04T08:53:59.3334897Z * [new branch] gh/mikaylagawarecki/356/orig -> origin/gh/mikaylagawarecki/356/orig 2025-12-04T08:53:59.3334986Z * [new branch] gh/mikaylagawarecki/357/base -> origin/gh/mikaylagawarecki/357/base 2025-12-04T08:53:59.3335077Z * [new branch] gh/mikaylagawarecki/357/head -> 
origin/gh/mikaylagawarecki/357/head 2025-12-04T08:53:59.3335168Z * [new branch] gh/mikaylagawarecki/357/orig -> origin/gh/mikaylagawarecki/357/orig 2025-12-04T08:53:59.3335259Z * [new branch] gh/mikaylagawarecki/359/base -> origin/gh/mikaylagawarecki/359/base 2025-12-04T08:53:59.3335350Z * [new branch] gh/mikaylagawarecki/359/head -> origin/gh/mikaylagawarecki/359/head 2025-12-04T08:53:59.3335441Z * [new branch] gh/mikaylagawarecki/359/orig -> origin/gh/mikaylagawarecki/359/orig 2025-12-04T08:53:59.3335569Z * [new branch] gh/mikaylagawarecki/360/base -> origin/gh/mikaylagawarecki/360/base 2025-12-04T08:53:59.3335661Z * [new branch] gh/mikaylagawarecki/360/head -> origin/gh/mikaylagawarecki/360/head 2025-12-04T08:53:59.3335750Z * [new branch] gh/mikaylagawarecki/360/orig -> origin/gh/mikaylagawarecki/360/orig 2025-12-04T08:53:59.3335841Z * [new branch] gh/mikaylagawarecki/361/base -> origin/gh/mikaylagawarecki/361/base 2025-12-04T08:53:59.3335931Z * [new branch] gh/mikaylagawarecki/361/head -> origin/gh/mikaylagawarecki/361/head 2025-12-04T08:53:59.3336022Z * [new branch] gh/mikaylagawarecki/361/orig -> origin/gh/mikaylagawarecki/361/orig 2025-12-04T08:53:59.3336111Z * [new branch] gh/mikaylagawarecki/362/base -> origin/gh/mikaylagawarecki/362/base 2025-12-04T08:53:59.3336202Z * [new branch] gh/mikaylagawarecki/362/head -> origin/gh/mikaylagawarecki/362/head 2025-12-04T08:53:59.3336292Z * [new branch] gh/mikaylagawarecki/362/orig -> origin/gh/mikaylagawarecki/362/orig 2025-12-04T08:53:59.3336383Z * [new branch] gh/mikaylagawarecki/363/base -> origin/gh/mikaylagawarecki/363/base 2025-12-04T08:53:59.3336476Z * [new branch] gh/mikaylagawarecki/363/head -> origin/gh/mikaylagawarecki/363/head 2025-12-04T08:53:59.3336567Z * [new branch] gh/mikaylagawarecki/363/orig -> origin/gh/mikaylagawarecki/363/orig 2025-12-04T08:53:59.3336656Z * [new branch] gh/mikaylagawarecki/364/base -> origin/gh/mikaylagawarecki/364/base 2025-12-04T08:53:59.3336747Z * [new branch] 
gh/mikaylagawarecki/364/head -> origin/gh/mikaylagawarecki/364/head 2025-12-04T08:53:59.3336838Z * [new branch] gh/mikaylagawarecki/364/orig -> origin/gh/mikaylagawarecki/364/orig 2025-12-04T08:53:59.3336929Z * [new branch] gh/mikaylagawarecki/365/base -> origin/gh/mikaylagawarecki/365/base 2025-12-04T08:53:59.3337022Z * [new branch] gh/mikaylagawarecki/365/head -> origin/gh/mikaylagawarecki/365/head 2025-12-04T08:53:59.3337112Z * [new branch] gh/mikaylagawarecki/365/orig -> origin/gh/mikaylagawarecki/365/orig 2025-12-04T08:53:59.3337205Z * [new branch] gh/mikaylagawarecki/366/base -> origin/gh/mikaylagawarecki/366/base 2025-12-04T08:53:59.3337296Z * [new branch] gh/mikaylagawarecki/366/head -> origin/gh/mikaylagawarecki/366/head 2025-12-04T08:53:59.3337388Z * [new branch] gh/mikaylagawarecki/366/orig -> origin/gh/mikaylagawarecki/366/orig 2025-12-04T08:53:59.3337505Z * [new branch] gh/mikaylagawarecki/367/base -> origin/gh/mikaylagawarecki/367/base 2025-12-04T08:53:59.3337594Z * [new branch] gh/mikaylagawarecki/367/head -> origin/gh/mikaylagawarecki/367/head 2025-12-04T08:53:59.3337683Z * [new branch] gh/mikaylagawarecki/367/orig -> origin/gh/mikaylagawarecki/367/orig 2025-12-04T08:53:59.3337775Z * [new branch] gh/mikaylagawarecki/368/base -> origin/gh/mikaylagawarecki/368/base 2025-12-04T08:53:59.3337865Z * [new branch] gh/mikaylagawarecki/368/head -> origin/gh/mikaylagawarecki/368/head 2025-12-04T08:53:59.3337954Z * [new branch] gh/mikaylagawarecki/368/orig -> origin/gh/mikaylagawarecki/368/orig 2025-12-04T08:53:59.3338045Z * [new branch] gh/mikaylagawarecki/369/base -> origin/gh/mikaylagawarecki/369/base 2025-12-04T08:53:59.3338134Z * [new branch] gh/mikaylagawarecki/369/head -> origin/gh/mikaylagawarecki/369/head 2025-12-04T08:53:59.3338225Z * [new branch] gh/mikaylagawarecki/369/orig -> origin/gh/mikaylagawarecki/369/orig 2025-12-04T08:53:59.3338319Z * [new branch] gh/mikaylagawarecki/370/base -> origin/gh/mikaylagawarecki/370/base 
2025-12-04T08:53:59.3338410Z * [new branch] gh/mikaylagawarecki/370/head -> origin/gh/mikaylagawarecki/370/head 2025-12-04T08:53:59.3338524Z * [new branch] gh/mikaylagawarecki/370/orig -> origin/gh/mikaylagawarecki/370/orig 2025-12-04T08:53:59.3338614Z * [new branch] gh/mikaylagawarecki/371/base -> origin/gh/mikaylagawarecki/371/base 2025-12-04T08:53:59.3338704Z * [new branch] gh/mikaylagawarecki/371/head -> origin/gh/mikaylagawarecki/371/head 2025-12-04T08:53:59.3338795Z * [new branch] gh/mikaylagawarecki/371/orig -> origin/gh/mikaylagawarecki/371/orig 2025-12-04T08:53:59.3338884Z * [new branch] gh/mikaylagawarecki/372/base -> origin/gh/mikaylagawarecki/372/base 2025-12-04T08:53:59.3338975Z * [new branch] gh/mikaylagawarecki/372/head -> origin/gh/mikaylagawarecki/372/head 2025-12-04T08:53:59.3339068Z * [new branch] gh/mikaylagawarecki/372/orig -> origin/gh/mikaylagawarecki/372/orig 2025-12-04T08:53:59.3339158Z * [new branch] gh/mikaylagawarecki/373/base -> origin/gh/mikaylagawarecki/373/base 2025-12-04T08:53:59.3339248Z * [new branch] gh/mikaylagawarecki/373/head -> origin/gh/mikaylagawarecki/373/head 2025-12-04T08:53:59.3339339Z * [new branch] gh/mikaylagawarecki/373/orig -> origin/gh/mikaylagawarecki/373/orig 2025-12-04T08:53:59.3339429Z * [new branch] gh/mikaylagawarecki/374/base -> origin/gh/mikaylagawarecki/374/base 2025-12-04T08:53:59.3339518Z * [new branch] gh/mikaylagawarecki/374/head -> origin/gh/mikaylagawarecki/374/head 2025-12-04T08:53:59.3339609Z * [new branch] gh/mikaylagawarecki/374/orig -> origin/gh/mikaylagawarecki/374/orig 2025-12-04T08:53:59.3339700Z * [new branch] gh/mikaylagawarecki/375/base -> origin/gh/mikaylagawarecki/375/base 2025-12-04T08:53:59.3339790Z * [new branch] gh/mikaylagawarecki/375/head -> origin/gh/mikaylagawarecki/375/head 2025-12-04T08:53:59.3339880Z * [new branch] gh/mikaylagawarecki/375/orig -> origin/gh/mikaylagawarecki/375/orig 2025-12-04T08:53:59.3339972Z * [new branch] gh/mikaylagawarecki/376/base -> 
origin/gh/mikaylagawarecki/376/base 2025-12-04T08:53:59.3340063Z * [new branch] gh/mikaylagawarecki/376/head -> origin/gh/mikaylagawarecki/376/head 2025-12-04T08:53:59.3340153Z * [new branch] gh/mikaylagawarecki/376/orig -> origin/gh/mikaylagawarecki/376/orig 2025-12-04T08:53:59.3340243Z * [new branch] gh/mikaylagawarecki/377/base -> origin/gh/mikaylagawarecki/377/base 2025-12-04T08:53:59.3340334Z * [new branch] gh/mikaylagawarecki/377/head -> origin/gh/mikaylagawarecki/377/head 2025-12-04T08:53:59.3340449Z * [new branch] gh/mikaylagawarecki/377/orig -> origin/gh/mikaylagawarecki/377/orig 2025-12-04T08:53:59.3340539Z * [new branch] gh/mikaylagawarecki/378/base -> origin/gh/mikaylagawarecki/378/base 2025-12-04T08:53:59.3340631Z * [new branch] gh/mikaylagawarecki/378/head -> origin/gh/mikaylagawarecki/378/head 2025-12-04T08:53:59.3340722Z * [new branch] gh/mikaylagawarecki/378/orig -> origin/gh/mikaylagawarecki/378/orig 2025-12-04T08:53:59.3340812Z * [new branch] gh/mikaylagawarecki/379/base -> origin/gh/mikaylagawarecki/379/base 2025-12-04T08:53:59.3340903Z * [new branch] gh/mikaylagawarecki/379/head -> origin/gh/mikaylagawarecki/379/head 2025-12-04T08:53:59.3340992Z * [new branch] gh/mikaylagawarecki/379/orig -> origin/gh/mikaylagawarecki/379/orig 2025-12-04T08:53:59.3341082Z * [new branch] gh/mikaylagawarecki/380/base -> origin/gh/mikaylagawarecki/380/base 2025-12-04T08:53:59.3341175Z * [new branch] gh/mikaylagawarecki/380/head -> origin/gh/mikaylagawarecki/380/head 2025-12-04T08:53:59.3341266Z * [new branch] gh/mikaylagawarecki/380/orig -> origin/gh/mikaylagawarecki/380/orig 2025-12-04T08:53:59.3341356Z * [new branch] gh/mikaylagawarecki/381/base -> origin/gh/mikaylagawarecki/381/base 2025-12-04T08:53:59.3341472Z * [new branch] gh/mikaylagawarecki/381/head -> origin/gh/mikaylagawarecki/381/head 2025-12-04T08:53:59.3341562Z * [new branch] gh/mikaylagawarecki/381/orig -> origin/gh/mikaylagawarecki/381/orig 2025-12-04T08:53:59.3341653Z * [new branch] 
gh/mikaylagawarecki/382/base -> origin/gh/mikaylagawarecki/382/base 2025-12-04T08:53:59.3341742Z * [new branch] gh/mikaylagawarecki/382/head -> origin/gh/mikaylagawarecki/382/head 2025-12-04T08:53:59.3341831Z * [new branch] gh/mikaylagawarecki/382/orig -> origin/gh/mikaylagawarecki/382/orig 2025-12-04T08:53:59.3341956Z * [new branch] gh/mikaylagawarecki/383/base -> origin/gh/mikaylagawarecki/383/base 2025-12-04T08:53:59.3342046Z * [new branch] gh/mikaylagawarecki/383/head -> origin/gh/mikaylagawarecki/383/head 2025-12-04T08:53:59.3342137Z * [new branch] gh/mikaylagawarecki/383/orig -> origin/gh/mikaylagawarecki/383/orig 2025-12-04T08:53:59.3342231Z * [new branch] gh/mikaylagawarecki/384/base -> origin/gh/mikaylagawarecki/384/base 2025-12-04T08:53:59.3342321Z * [new branch] gh/mikaylagawarecki/384/head -> origin/gh/mikaylagawarecki/384/head 2025-12-04T08:53:59.3342410Z * [new branch] gh/mikaylagawarecki/384/orig -> origin/gh/mikaylagawarecki/384/orig 2025-12-04T08:53:59.3342502Z * [new branch] gh/mikaylagawarecki/385/base -> origin/gh/mikaylagawarecki/385/base 2025-12-04T08:53:59.3342591Z * [new branch] gh/mikaylagawarecki/385/head -> origin/gh/mikaylagawarecki/385/head 2025-12-04T08:53:59.3342683Z * [new branch] gh/mikaylagawarecki/385/orig -> origin/gh/mikaylagawarecki/385/orig 2025-12-04T08:53:59.3342772Z * [new branch] gh/mikaylagawarecki/386/base -> origin/gh/mikaylagawarecki/386/base 2025-12-04T08:53:59.3342862Z * [new branch] gh/mikaylagawarecki/386/head -> origin/gh/mikaylagawarecki/386/head 2025-12-04T08:53:59.3342955Z * [new branch] gh/mikaylagawarecki/386/orig -> origin/gh/mikaylagawarecki/386/orig 2025-12-04T08:53:59.3343045Z * [new branch] gh/mikaylagawarecki/387/base -> origin/gh/mikaylagawarecki/387/base 2025-12-04T08:53:59.3343134Z * [new branch] gh/mikaylagawarecki/387/head -> origin/gh/mikaylagawarecki/387/head 2025-12-04T08:53:59.3343224Z * [new branch] gh/mikaylagawarecki/387/orig -> origin/gh/mikaylagawarecki/387/orig 
2025-12-04T08:53:59.3343314Z * [new branch] gh/mikaylagawarecki/388/base -> origin/gh/mikaylagawarecki/388/base 2025-12-04T08:53:59.3343449Z * [new branch] gh/mikaylagawarecki/388/head -> origin/gh/mikaylagawarecki/388/head 2025-12-04T08:53:59.3343541Z * [new branch] gh/mikaylagawarecki/388/orig -> origin/gh/mikaylagawarecki/388/orig 2025-12-04T08:53:59.3343630Z * [new branch] gh/mikaylagawarecki/389/base -> origin/gh/mikaylagawarecki/389/base 2025-12-04T08:53:59.3343720Z * [new branch] gh/mikaylagawarecki/389/head -> origin/gh/mikaylagawarecki/389/head 2025-12-04T08:53:59.3343812Z * [new branch] gh/mikaylagawarecki/389/orig -> origin/gh/mikaylagawarecki/389/orig 2025-12-04T08:53:59.3343902Z * [new branch] gh/mikaylagawarecki/390/base -> origin/gh/mikaylagawarecki/390/base 2025-12-04T08:53:59.3343992Z * [new branch] gh/mikaylagawarecki/390/head -> origin/gh/mikaylagawarecki/390/head 2025-12-04T08:53:59.3344083Z * [new branch] gh/mikaylagawarecki/390/orig -> origin/gh/mikaylagawarecki/390/orig 2025-12-04T08:53:59.3344175Z * [new branch] gh/mikaylagawarecki/391/base -> origin/gh/mikaylagawarecki/391/base 2025-12-04T08:53:59.3344266Z * [new branch] gh/mikaylagawarecki/391/head -> origin/gh/mikaylagawarecki/391/head 2025-12-04T08:53:59.3344356Z * [new branch] gh/mikaylagawarecki/391/orig -> origin/gh/mikaylagawarecki/391/orig 2025-12-04T08:53:59.3344489Z * [new branch] gh/mikaylagawarecki/392/base -> origin/gh/mikaylagawarecki/392/base 2025-12-04T08:53:59.3344582Z * [new branch] gh/mikaylagawarecki/392/head -> origin/gh/mikaylagawarecki/392/head 2025-12-04T08:53:59.3344672Z * [new branch] gh/mikaylagawarecki/392/orig -> origin/gh/mikaylagawarecki/392/orig 2025-12-04T08:53:59.3344741Z * [new branch] gh/mlazos/41/base -> origin/gh/mlazos/41/base 2025-12-04T08:53:59.3344809Z * [new branch] gh/mlazos/41/head -> origin/gh/mlazos/41/head 2025-12-04T08:53:59.3344876Z * [new branch] gh/mlazos/41/orig -> origin/gh/mlazos/41/orig 2025-12-04T08:53:59.3344942Z * [new branch] 
gh/mlazos/42/base -> origin/gh/mlazos/42/base 2025-12-04T08:53:59.3345009Z * [new branch] gh/mlazos/42/head -> origin/gh/mlazos/42/head 2025-12-04T08:53:59.3345075Z * [new branch] gh/mlazos/42/orig -> origin/gh/mlazos/42/orig 2025-12-04T08:53:59.3345140Z * [new branch] gh/mlazos/43/base -> origin/gh/mlazos/43/base 2025-12-04T08:53:59.3345206Z * [new branch] gh/mlazos/43/head -> origin/gh/mlazos/43/head 2025-12-04T08:53:59.3345271Z * [new branch] gh/mlazos/43/orig -> origin/gh/mlazos/43/orig 2025-12-04T08:53:59.3345335Z * [new branch] gh/mlazos/44/base -> origin/gh/mlazos/44/base 2025-12-04T08:53:59.3345401Z * [new branch] gh/mlazos/44/head -> origin/gh/mlazos/44/head 2025-12-04T08:53:59.3345467Z * [new branch] gh/mlazos/44/orig -> origin/gh/mlazos/44/orig 2025-12-04T08:53:59.3345531Z * [new branch] gh/mlazos/47/base -> origin/gh/mlazos/47/base 2025-12-04T08:53:59.3345597Z * [new branch] gh/mlazos/47/head -> origin/gh/mlazos/47/head 2025-12-04T08:53:59.3345662Z * [new branch] gh/mlazos/47/orig -> origin/gh/mlazos/47/orig 2025-12-04T08:53:59.3345729Z * [new branch] gh/mlazos/48/base -> origin/gh/mlazos/48/base 2025-12-04T08:53:59.3345793Z * [new branch] gh/mlazos/48/head -> origin/gh/mlazos/48/head 2025-12-04T08:53:59.3345858Z * [new branch] gh/mlazos/48/orig -> origin/gh/mlazos/48/orig 2025-12-04T08:53:59.3345924Z * [new branch] gh/mlazos/49/base -> origin/gh/mlazos/49/base 2025-12-04T08:53:59.3345989Z * [new branch] gh/mlazos/49/head -> origin/gh/mlazos/49/head 2025-12-04T08:53:59.3346082Z * [new branch] gh/mlazos/49/orig -> origin/gh/mlazos/49/orig 2025-12-04T08:53:59.3346148Z * [new branch] gh/mlazos/50/base -> origin/gh/mlazos/50/base 2025-12-04T08:53:59.3346213Z * [new branch] gh/mlazos/50/head -> origin/gh/mlazos/50/head 2025-12-04T08:53:59.3346277Z * [new branch] gh/mlazos/50/orig -> origin/gh/mlazos/50/orig 2025-12-04T08:53:59.3346344Z * [new branch] gh/mlazos/51/base -> origin/gh/mlazos/51/base 2025-12-04T08:53:59.3346409Z * [new branch] gh/mlazos/51/head 
-> origin/gh/mlazos/51/head 2025-12-04T08:53:59.3346474Z * [new branch] gh/mlazos/51/orig -> origin/gh/mlazos/51/orig 2025-12-04T08:53:59.3346541Z * [new branch] gh/mlazos/52/base -> origin/gh/mlazos/52/base 2025-12-04T08:53:59.3346606Z * [new branch] gh/mlazos/52/head -> origin/gh/mlazos/52/head 2025-12-04T08:53:59.3346672Z * [new branch] gh/mlazos/52/orig -> origin/gh/mlazos/52/orig 2025-12-04T08:53:59.3346739Z * [new branch] gh/mlazos/53/base -> origin/gh/mlazos/53/base 2025-12-04T08:53:59.3346804Z * [new branch] gh/mlazos/53/head -> origin/gh/mlazos/53/head 2025-12-04T08:53:59.3346869Z * [new branch] gh/mlazos/53/orig -> origin/gh/mlazos/53/orig 2025-12-04T08:53:59.3346972Z * [new branch] gh/mlazos/54/base -> origin/gh/mlazos/54/base 2025-12-04T08:53:59.3347037Z * [new branch] gh/mlazos/54/head -> origin/gh/mlazos/54/head 2025-12-04T08:53:59.3347102Z * [new branch] gh/mlazos/54/orig -> origin/gh/mlazos/54/orig 2025-12-04T08:53:59.3347168Z * [new branch] gh/mlazos/55/base -> origin/gh/mlazos/55/base 2025-12-04T08:53:59.3347232Z * [new branch] gh/mlazos/55/head -> origin/gh/mlazos/55/head 2025-12-04T08:53:59.3347298Z * [new branch] gh/mlazos/55/orig -> origin/gh/mlazos/55/orig 2025-12-04T08:53:59.3347365Z * [new branch] gh/mlazos/56/base -> origin/gh/mlazos/56/base 2025-12-04T08:53:59.3347430Z * [new branch] gh/mlazos/56/head -> origin/gh/mlazos/56/head 2025-12-04T08:53:59.3347496Z * [new branch] gh/mlazos/56/orig -> origin/gh/mlazos/56/orig 2025-12-04T08:53:59.3347562Z * [new branch] gh/mlazos/57/base -> origin/gh/mlazos/57/base 2025-12-04T08:53:59.3347627Z * [new branch] gh/mlazos/57/head -> origin/gh/mlazos/57/head 2025-12-04T08:53:59.3347694Z * [new branch] gh/mlazos/57/orig -> origin/gh/mlazos/57/orig 2025-12-04T08:53:59.3347758Z * [new branch] gh/mlazos/58/base -> origin/gh/mlazos/58/base 2025-12-04T08:53:59.3347823Z * [new branch] gh/mlazos/58/head -> origin/gh/mlazos/58/head 2025-12-04T08:53:59.3347889Z * [new branch] gh/mlazos/58/orig -> 
origin/gh/mlazos/58/orig 2025-12-04T08:53:59.3347956Z * [new branch] gh/mlazos/59/base -> origin/gh/mlazos/59/base 2025-12-04T08:53:59.3348021Z * [new branch] gh/mlazos/59/head -> origin/gh/mlazos/59/head 2025-12-04T08:53:59.3348087Z * [new branch] gh/mlazos/59/orig -> origin/gh/mlazos/59/orig 2025-12-04T08:53:59.3348154Z * [new branch] gh/mlazos/60/base -> origin/gh/mlazos/60/base 2025-12-04T08:53:59.3348220Z * [new branch] gh/mlazos/60/head -> origin/gh/mlazos/60/head 2025-12-04T08:53:59.3348288Z * [new branch] gh/mlazos/60/orig -> origin/gh/mlazos/60/orig 2025-12-04T08:53:59.3348353Z * [new branch] gh/mlazos/61/base -> origin/gh/mlazos/61/base 2025-12-04T08:53:59.3348417Z * [new branch] gh/mlazos/61/head -> origin/gh/mlazos/61/head 2025-12-04T08:53:59.3348483Z * [new branch] gh/mlazos/61/orig -> origin/gh/mlazos/61/orig 2025-12-04T08:53:59.3348577Z * [new branch] gh/mlazos/62/base -> origin/gh/mlazos/62/base 2025-12-04T08:53:59.3348642Z * [new branch] gh/mlazos/62/head -> origin/gh/mlazos/62/head 2025-12-04T08:53:59.3348708Z * [new branch] gh/mlazos/62/orig -> origin/gh/mlazos/62/orig 2025-12-04T08:53:59.3348775Z * [new branch] gh/mlazos/63/base -> origin/gh/mlazos/63/base 2025-12-04T08:53:59.3348840Z * [new branch] gh/mlazos/63/head -> origin/gh/mlazos/63/head 2025-12-04T08:53:59.3348906Z * [new branch] gh/mlazos/63/orig -> origin/gh/mlazos/63/orig 2025-12-04T08:53:59.3348971Z * [new branch] gh/mlazos/64/base -> origin/gh/mlazos/64/base 2025-12-04T08:53:59.3349038Z * [new branch] gh/mlazos/64/head -> origin/gh/mlazos/64/head 2025-12-04T08:53:59.3349103Z * [new branch] gh/mlazos/64/orig -> origin/gh/mlazos/64/orig 2025-12-04T08:53:59.3349170Z * [new branch] gh/mlazos/65/base -> origin/gh/mlazos/65/base 2025-12-04T08:53:59.3349237Z * [new branch] gh/mlazos/65/head -> origin/gh/mlazos/65/head 2025-12-04T08:53:59.3349302Z * [new branch] gh/mlazos/65/orig -> origin/gh/mlazos/65/orig 2025-12-04T08:53:59.3349397Z * [new branch] gh/mlazos/66/base -> 
origin/gh/mlazos/66/base 2025-12-04T08:53:59.3349463Z * [new branch] gh/mlazos/66/head -> origin/gh/mlazos/66/head 2025-12-04T08:53:59.3349528Z * [new branch] gh/mlazos/66/orig -> origin/gh/mlazos/66/orig 2025-12-04T08:53:59.3349594Z * [new branch] gh/mlazos/67/base -> origin/gh/mlazos/67/base 2025-12-04T08:53:59.3349660Z * [new branch] gh/mlazos/67/head -> origin/gh/mlazos/67/head 2025-12-04T08:53:59.3349726Z * [new branch] gh/mlazos/67/orig -> origin/gh/mlazos/67/orig 2025-12-04T08:53:59.3349791Z * [new branch] gh/mlazos/68/base -> origin/gh/mlazos/68/base 2025-12-04T08:53:59.3349860Z * [new branch] gh/mlazos/68/head -> origin/gh/mlazos/68/head 2025-12-04T08:53:59.3349926Z * [new branch] gh/mlazos/68/orig -> origin/gh/mlazos/68/orig 2025-12-04T08:53:59.3349990Z * [new branch] gh/mlazos/69/base -> origin/gh/mlazos/69/base 2025-12-04T08:53:59.3350058Z * [new branch] gh/mlazos/69/head -> origin/gh/mlazos/69/head 2025-12-04T08:53:59.3350123Z * [new branch] gh/mlazos/69/orig -> origin/gh/mlazos/69/orig 2025-12-04T08:53:59.3350188Z * [new branch] gh/mlazos/70/base -> origin/gh/mlazos/70/base 2025-12-04T08:53:59.3350254Z * [new branch] gh/mlazos/70/head -> origin/gh/mlazos/70/head 2025-12-04T08:53:59.3350318Z * [new branch] gh/mlazos/70/orig -> origin/gh/mlazos/70/orig 2025-12-04T08:53:59.3350384Z * [new branch] gh/mlazos/71/base -> origin/gh/mlazos/71/base 2025-12-04T08:53:59.3350451Z * [new branch] gh/mlazos/71/head -> origin/gh/mlazos/71/head 2025-12-04T08:53:59.3350516Z * [new branch] gh/mlazos/71/orig -> origin/gh/mlazos/71/orig 2025-12-04T08:53:59.3350580Z * [new branch] gh/mlazos/72/base -> origin/gh/mlazos/72/base 2025-12-04T08:53:59.3350782Z * [new branch] gh/mlazos/72/head -> origin/gh/mlazos/72/head 2025-12-04T08:53:59.3350847Z * [new branch] gh/mlazos/72/orig -> origin/gh/mlazos/72/orig 2025-12-04T08:53:59.3350914Z * [new branch] gh/mlazos/73/base -> origin/gh/mlazos/73/base 2025-12-04T08:53:59.3350979Z * [new branch] gh/mlazos/73/head -> 
origin/gh/mlazos/73/head 2025-12-04T08:53:59.3351044Z * [new branch] gh/mlazos/73/orig -> origin/gh/mlazos/73/orig 2025-12-04T08:53:59.3351148Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-12-04T08:53:59.3351214Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-12-04T08:53:59.3351286Z * [new branch] gh/muchulee8/73/base -> origin/gh/muchulee8/73/base 2025-12-04T08:53:59.3351360Z * [new branch] gh/muchulee8/73/head -> origin/gh/muchulee8/73/head 2025-12-04T08:53:59.3351432Z * [new branch] gh/muchulee8/73/orig -> origin/gh/muchulee8/73/orig 2025-12-04T08:53:59.3351516Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-12-04T08:53:59.3351597Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-12-04T08:53:59.3351676Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-12-04T08:53:59.3351755Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-12-04T08:53:59.3351835Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-12-04T08:53:59.3351947Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-12-04T08:53:59.3352027Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-12-04T08:53:59.3352145Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-12-04T08:53:59.3352225Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-12-04T08:53:59.3352304Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-12-04T08:53:59.3352382Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-12-04T08:53:59.3352460Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-12-04T08:53:59.3352539Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-12-04T08:53:59.3352619Z * [new branch] 
gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-12-04T08:53:59.3352696Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-12-04T08:53:59.3352776Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-12-04T08:53:59.3352855Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-12-04T08:53:59.3352933Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-12-04T08:53:59.3353012Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-12-04T08:53:59.3353090Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-12-04T08:53:59.3353168Z * [new branch] gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-12-04T08:53:59.3353249Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-12-04T08:53:59.3353326Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-12-04T08:53:59.3353403Z * [new branch] gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-12-04T08:53:59.3353484Z * [new branch] gh/naveenthangudu/9/base -> origin/gh/naveenthangudu/9/base 2025-12-04T08:53:59.3353562Z * [new branch] gh/naveenthangudu/9/head -> origin/gh/naveenthangudu/9/head 2025-12-04T08:53:59.3353641Z * [new branch] gh/naveenthangudu/9/orig -> origin/gh/naveenthangudu/9/orig 2025-12-04T08:53:59.3353714Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-12-04T08:53:59.3353787Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-12-04T08:53:59.3353861Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-12-04T08:53:59.3353980Z * [new branch] gh/nikitaved/10/base -> origin/gh/nikitaved/10/base 2025-12-04T08:53:59.3354053Z * [new branch] gh/nikitaved/10/head -> origin/gh/nikitaved/10/head 2025-12-04T08:53:59.3354126Z * [new branch] gh/nikitaved/10/orig -> origin/gh/nikitaved/10/orig 
2025-12-04T08:53:59.3354197Z * [new branch] gh/nikitaved/11/base -> origin/gh/nikitaved/11/base 2025-12-04T08:53:59.3354268Z * [new branch] gh/nikitaved/11/head -> origin/gh/nikitaved/11/head 2025-12-04T08:53:59.3354340Z * [new branch] gh/nikitaved/11/orig -> origin/gh/nikitaved/11/orig 2025-12-04T08:53:59.3354411Z * [new branch] gh/nikitaved/12/base -> origin/gh/nikitaved/12/base 2025-12-04T08:53:59.3354481Z * [new branch] gh/nikitaved/12/head -> origin/gh/nikitaved/12/head 2025-12-04T08:53:59.3354553Z * [new branch] gh/nikitaved/12/orig -> origin/gh/nikitaved/12/orig 2025-12-04T08:53:59.3354624Z * [new branch] gh/nikitaved/13/base -> origin/gh/nikitaved/13/base 2025-12-04T08:53:59.3354694Z * [new branch] gh/nikitaved/13/head -> origin/gh/nikitaved/13/head 2025-12-04T08:53:59.3354769Z * [new branch] gh/nikitaved/13/orig -> origin/gh/nikitaved/13/orig 2025-12-04T08:53:59.3354867Z * [new branch] gh/nikitaved/14/base -> origin/gh/nikitaved/14/base 2025-12-04T08:53:59.3354938Z * [new branch] gh/nikitaved/14/head -> origin/gh/nikitaved/14/head 2025-12-04T08:53:59.3355010Z * [new branch] gh/nikitaved/14/orig -> origin/gh/nikitaved/14/orig 2025-12-04T08:53:59.3355080Z * [new branch] gh/nikitaved/15/base -> origin/gh/nikitaved/15/base 2025-12-04T08:53:59.3355152Z * [new branch] gh/nikitaved/15/head -> origin/gh/nikitaved/15/head 2025-12-04T08:53:59.3355221Z * [new branch] gh/nikitaved/15/orig -> origin/gh/nikitaved/15/orig 2025-12-04T08:53:59.3355293Z * [new branch] gh/nikitaved/16/base -> origin/gh/nikitaved/16/base 2025-12-04T08:53:59.3355365Z * [new branch] gh/nikitaved/16/head -> origin/gh/nikitaved/16/head 2025-12-04T08:53:59.3355434Z * [new branch] gh/nikitaved/16/orig -> origin/gh/nikitaved/16/orig 2025-12-04T08:53:59.3355506Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 2025-12-04T08:53:59.3355578Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-12-04T08:53:59.3355648Z * [new branch] gh/nikitaved/2/orig -> 
origin/gh/nikitaved/2/orig 2025-12-04T08:53:59.3355717Z * [new branch] gh/nikitaved/4/base -> origin/gh/nikitaved/4/base 2025-12-04T08:53:59.3355789Z * [new branch] gh/nikitaved/4/head -> origin/gh/nikitaved/4/head 2025-12-04T08:53:59.3355858Z * [new branch] gh/nikitaved/4/orig -> origin/gh/nikitaved/4/orig 2025-12-04T08:53:59.3355929Z * [new branch] gh/nikitaved/5/base -> origin/gh/nikitaved/5/base 2025-12-04T08:53:59.3356000Z * [new branch] gh/nikitaved/5/head -> origin/gh/nikitaved/5/head 2025-12-04T08:53:59.3356069Z * [new branch] gh/nikitaved/5/orig -> origin/gh/nikitaved/5/orig 2025-12-04T08:53:59.3356139Z * [new branch] gh/nikitaved/6/base -> origin/gh/nikitaved/6/base 2025-12-04T08:53:59.3356211Z * [new branch] gh/nikitaved/6/head -> origin/gh/nikitaved/6/head 2025-12-04T08:53:59.3356280Z * [new branch] gh/nikitaved/6/orig -> origin/gh/nikitaved/6/orig 2025-12-04T08:53:59.3356349Z * [new branch] gh/nikitaved/8/base -> origin/gh/nikitaved/8/base 2025-12-04T08:53:59.3356419Z * [new branch] gh/nikitaved/8/head -> origin/gh/nikitaved/8/head 2025-12-04T08:53:59.3356489Z * [new branch] gh/nikitaved/8/orig -> origin/gh/nikitaved/8/orig 2025-12-04T08:53:59.3356595Z * [new branch] gh/nikitaved/9/base -> origin/gh/nikitaved/9/base 2025-12-04T08:53:59.3356666Z * [new branch] gh/nikitaved/9/head -> origin/gh/nikitaved/9/head 2025-12-04T08:53:59.3356735Z * [new branch] gh/nikitaved/9/orig -> origin/gh/nikitaved/9/orig 2025-12-04T08:53:59.3356805Z * [new branch] gh/oulgen/10/base -> origin/gh/oulgen/10/base 2025-12-04T08:53:59.3356872Z * [new branch] gh/oulgen/10/head -> origin/gh/oulgen/10/head 2025-12-04T08:53:59.3356938Z * [new branch] gh/oulgen/10/orig -> origin/gh/oulgen/10/orig 2025-12-04T08:53:59.3357006Z * [new branch] gh/oulgen/11/base -> origin/gh/oulgen/11/base 2025-12-04T08:53:59.3357071Z * [new branch] gh/oulgen/11/head -> origin/gh/oulgen/11/head 2025-12-04T08:53:59.3357137Z * [new branch] gh/oulgen/11/orig -> origin/gh/oulgen/11/orig 
2025-12-04T08:53:59.3357204Z * [new branch] gh/oulgen/12/base -> origin/gh/oulgen/12/base 2025-12-04T08:53:59.3357269Z * [new branch] gh/oulgen/12/head -> origin/gh/oulgen/12/head 2025-12-04T08:53:59.3357333Z * [new branch] gh/oulgen/12/orig -> origin/gh/oulgen/12/orig 2025-12-04T08:53:59.3357426Z * [new branch] gh/oulgen/13/base -> origin/gh/oulgen/13/base 2025-12-04T08:53:59.3357491Z * [new branch] gh/oulgen/13/head -> origin/gh/oulgen/13/head 2025-12-04T08:53:59.3357556Z * [new branch] gh/oulgen/13/orig -> origin/gh/oulgen/13/orig 2025-12-04T08:53:59.3357624Z * [new branch] gh/oulgen/14/base -> origin/gh/oulgen/14/base 2025-12-04T08:53:59.3357689Z * [new branch] gh/oulgen/14/head -> origin/gh/oulgen/14/head 2025-12-04T08:53:59.3357754Z * [new branch] gh/oulgen/14/orig -> origin/gh/oulgen/14/orig 2025-12-04T08:53:59.3357822Z * [new branch] gh/oulgen/15/base -> origin/gh/oulgen/15/base 2025-12-04T08:53:59.3357887Z * [new branch] gh/oulgen/15/head -> origin/gh/oulgen/15/head 2025-12-04T08:53:59.3357952Z * [new branch] gh/oulgen/15/orig -> origin/gh/oulgen/15/orig 2025-12-04T08:53:59.3358019Z * [new branch] gh/oulgen/16/base -> origin/gh/oulgen/16/base 2025-12-04T08:53:59.3358084Z * [new branch] gh/oulgen/16/head -> origin/gh/oulgen/16/head 2025-12-04T08:53:59.3358149Z * [new branch] gh/oulgen/16/orig -> origin/gh/oulgen/16/orig 2025-12-04T08:53:59.3358214Z * [new branch] gh/oulgen/17/base -> origin/gh/oulgen/17/base 2025-12-04T08:53:59.3358279Z * [new branch] gh/oulgen/17/head -> origin/gh/oulgen/17/head 2025-12-04T08:53:59.3358344Z * [new branch] gh/oulgen/17/orig -> origin/gh/oulgen/17/orig 2025-12-04T08:53:59.3358410Z * [new branch] gh/oulgen/18/base -> origin/gh/oulgen/18/base 2025-12-04T08:53:59.3358475Z * [new branch] gh/oulgen/18/head -> origin/gh/oulgen/18/head 2025-12-04T08:53:59.3358541Z * [new branch] gh/oulgen/18/orig -> origin/gh/oulgen/18/orig 2025-12-04T08:53:59.3358606Z * [new branch] gh/oulgen/19/base -> origin/gh/oulgen/19/base 
2025-12-04T08:53:59.3358672Z * [new branch] gh/oulgen/19/head -> origin/gh/oulgen/19/head 2025-12-04T08:53:59.3358739Z * [new branch] gh/oulgen/19/orig -> origin/gh/oulgen/19/orig 2025-12-04T08:53:59.3358804Z * [new branch] gh/oulgen/20/base -> origin/gh/oulgen/20/base 2025-12-04T08:53:59.3358870Z * [new branch] gh/oulgen/20/head -> origin/gh/oulgen/20/head 2025-12-04T08:53:59.3358936Z * [new branch] gh/oulgen/20/orig -> origin/gh/oulgen/20/orig 2025-12-04T08:53:59.3359026Z * [new branch] gh/oulgen/21/base -> origin/gh/oulgen/21/base 2025-12-04T08:53:59.3359091Z * [new branch] gh/oulgen/21/head -> origin/gh/oulgen/21/head 2025-12-04T08:53:59.3359157Z * [new branch] gh/oulgen/21/orig -> origin/gh/oulgen/21/orig 2025-12-04T08:53:59.3359222Z * [new branch] gh/oulgen/22/base -> origin/gh/oulgen/22/base 2025-12-04T08:53:59.3359289Z * [new branch] gh/oulgen/22/head -> origin/gh/oulgen/22/head 2025-12-04T08:53:59.3359355Z * [new branch] gh/oulgen/22/orig -> origin/gh/oulgen/22/orig 2025-12-04T08:53:59.3359421Z * [new branch] gh/oulgen/23/base -> origin/gh/oulgen/23/base 2025-12-04T08:53:59.3359486Z * [new branch] gh/oulgen/23/head -> origin/gh/oulgen/23/head 2025-12-04T08:53:59.3359558Z * [new branch] gh/oulgen/23/orig -> origin/gh/oulgen/23/orig 2025-12-04T08:53:59.3359625Z * [new branch] gh/oulgen/24/base -> origin/gh/oulgen/24/base 2025-12-04T08:53:59.3359691Z * [new branch] gh/oulgen/24/head -> origin/gh/oulgen/24/head 2025-12-04T08:53:59.3359760Z * [new branch] gh/oulgen/24/orig -> origin/gh/oulgen/24/orig 2025-12-04T08:53:59.3359826Z * [new branch] gh/oulgen/25/base -> origin/gh/oulgen/25/base 2025-12-04T08:53:59.3359922Z * [new branch] gh/oulgen/25/head -> origin/gh/oulgen/25/head 2025-12-04T08:53:59.3359992Z * [new branch] gh/oulgen/25/orig -> origin/gh/oulgen/25/orig 2025-12-04T08:53:59.3360058Z * [new branch] gh/oulgen/26/base -> origin/gh/oulgen/26/base 2025-12-04T08:53:59.3360127Z * [new branch] gh/oulgen/26/head -> origin/gh/oulgen/26/head 
2025-12-04T08:53:59.3360193Z * [new branch] gh/oulgen/26/orig -> origin/gh/oulgen/26/orig 2025-12-04T08:53:59.3360260Z * [new branch] gh/oulgen/4/base -> origin/gh/oulgen/4/base 2025-12-04T08:53:59.3360331Z * [new branch] gh/oulgen/4/head -> origin/gh/oulgen/4/head 2025-12-04T08:53:59.3360399Z * [new branch] gh/oulgen/4/orig -> origin/gh/oulgen/4/orig 2025-12-04T08:53:59.3360465Z * [new branch] gh/oulgen/7/base -> origin/gh/oulgen/7/base 2025-12-04T08:53:59.3360535Z * [new branch] gh/oulgen/7/head -> origin/gh/oulgen/7/head 2025-12-04T08:53:59.3360601Z * [new branch] gh/oulgen/7/orig -> origin/gh/oulgen/7/orig 2025-12-04T08:53:59.3360664Z * [new branch] gh/oulgen/8/base -> origin/gh/oulgen/8/base 2025-12-04T08:53:59.3360731Z * [new branch] gh/oulgen/8/head -> origin/gh/oulgen/8/head 2025-12-04T08:53:59.3360796Z * [new branch] gh/oulgen/8/orig -> origin/gh/oulgen/8/orig 2025-12-04T08:53:59.3360861Z * [new branch] gh/oulgen/9/base -> origin/gh/oulgen/9/base 2025-12-04T08:53:59.3360931Z * [new branch] gh/oulgen/9/head -> origin/gh/oulgen/9/head 2025-12-04T08:53:59.3360997Z * [new branch] gh/oulgen/9/orig -> origin/gh/oulgen/9/orig 2025-12-04T08:53:59.3361102Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-12-04T08:53:59.3361175Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-12-04T08:53:59.3361242Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-12-04T08:53:59.3361310Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-12-04T08:53:59.3361378Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-12-04T08:53:59.3361444Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-12-04T08:53:59.3361510Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-12-04T08:53:59.3361607Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-12-04T08:53:59.3361673Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 
2025-12-04T08:53:59.3361742Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-12-04T08:53:59.3361810Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-12-04T08:53:59.3361897Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-12-04T08:53:59.3361966Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-12-04T08:53:59.3362034Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-12-04T08:53:59.3362101Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-12-04T08:53:59.3362171Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-12-04T08:53:59.3362240Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-12-04T08:53:59.3362306Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-12-04T08:53:59.3362374Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-12-04T08:53:59.3362476Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-12-04T08:53:59.3362543Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-12-04T08:53:59.3362611Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-12-04T08:53:59.3362677Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-12-04T08:53:59.3362742Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-12-04T08:53:59.3362811Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-12-04T08:53:59.3362878Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-12-04T08:53:59.3362943Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-12-04T08:53:59.3363013Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-12-04T08:53:59.3363079Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-12-04T08:53:59.3363145Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-12-04T08:53:59.3363214Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 
2025-12-04T08:53:59.3363281Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-12-04T08:53:59.3363349Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-12-04T08:53:59.3363416Z * [new branch] gh/pearu/139/orig -> origin/gh/pearu/139/orig 2025-12-04T08:53:59.3363484Z * [new branch] gh/pearu/140/base -> origin/gh/pearu/140/base 2025-12-04T08:53:59.3363552Z * [new branch] gh/pearu/140/head -> origin/gh/pearu/140/head 2025-12-04T08:53:59.3363617Z * [new branch] gh/pearu/140/orig -> origin/gh/pearu/140/orig 2025-12-04T08:53:59.3363683Z * [new branch] gh/pearu/142/base -> origin/gh/pearu/142/base 2025-12-04T08:53:59.3363752Z * [new branch] gh/pearu/142/head -> origin/gh/pearu/142/head 2025-12-04T08:53:59.3363818Z * [new branch] gh/pearu/142/orig -> origin/gh/pearu/142/orig 2025-12-04T08:53:59.3363886Z * [new branch] gh/pearu/143/base -> origin/gh/pearu/143/base 2025-12-04T08:53:59.3363953Z * [new branch] gh/pearu/143/head -> origin/gh/pearu/143/head 2025-12-04T08:53:59.3364021Z * [new branch] gh/pearu/143/orig -> origin/gh/pearu/143/orig 2025-12-04T08:53:59.3364127Z * [new branch] gh/pearu/147/base -> origin/gh/pearu/147/base 2025-12-04T08:53:59.3364194Z * [new branch] gh/pearu/147/head -> origin/gh/pearu/147/head 2025-12-04T08:53:59.3364259Z * [new branch] gh/pearu/147/orig -> origin/gh/pearu/147/orig 2025-12-04T08:53:59.3364325Z * [new branch] gh/pearu/149/base -> origin/gh/pearu/149/base 2025-12-04T08:53:59.3364395Z * [new branch] gh/pearu/149/head -> origin/gh/pearu/149/head 2025-12-04T08:53:59.3364461Z * [new branch] gh/pearu/149/orig -> origin/gh/pearu/149/orig 2025-12-04T08:53:59.3364528Z * [new branch] gh/pearu/150/base -> origin/gh/pearu/150/base 2025-12-04T08:53:59.3364596Z * [new branch] gh/pearu/150/head -> origin/gh/pearu/150/head 2025-12-04T08:53:59.3364662Z * [new branch] gh/pearu/150/orig -> origin/gh/pearu/150/orig 2025-12-04T08:53:59.3364729Z * [new branch] gh/pearu/151/base -> origin/gh/pearu/151/base 
2025-12-04T08:53:59.3364799Z * [new branch] gh/pearu/151/head -> origin/gh/pearu/151/head 2025-12-04T08:53:59.3364866Z * [new branch] gh/pearu/151/orig -> origin/gh/pearu/151/orig 2025-12-04T08:53:59.3364936Z * [new branch] gh/pearu/152/base -> origin/gh/pearu/152/base 2025-12-04T08:53:59.3365035Z * [new branch] gh/pearu/152/head -> origin/gh/pearu/152/head 2025-12-04T08:53:59.3365102Z * [new branch] gh/pearu/152/orig -> origin/gh/pearu/152/orig 2025-12-04T08:53:59.3365173Z * [new branch] gh/pearu/153/base -> origin/gh/pearu/153/base 2025-12-04T08:53:59.3365238Z * [new branch] gh/pearu/153/head -> origin/gh/pearu/153/head 2025-12-04T08:53:59.3365304Z * [new branch] gh/pearu/153/orig -> origin/gh/pearu/153/orig 2025-12-04T08:53:59.3365372Z * [new branch] gh/pearu/154/base -> origin/gh/pearu/154/base 2025-12-04T08:53:59.3365438Z * [new branch] gh/pearu/154/head -> origin/gh/pearu/154/head 2025-12-04T08:53:59.3365504Z * [new branch] gh/pearu/154/orig -> origin/gh/pearu/154/orig 2025-12-04T08:53:59.3365576Z * [new branch] gh/pearu/155/base -> origin/gh/pearu/155/base 2025-12-04T08:53:59.3365642Z * [new branch] gh/pearu/155/head -> origin/gh/pearu/155/head 2025-12-04T08:53:59.3365709Z * [new branch] gh/pearu/155/orig -> origin/gh/pearu/155/orig 2025-12-04T08:53:59.3365778Z * [new branch] gh/pearu/156/base -> origin/gh/pearu/156/base 2025-12-04T08:53:59.3365844Z * [new branch] gh/pearu/156/head -> origin/gh/pearu/156/head 2025-12-04T08:53:59.3365912Z * [new branch] gh/pearu/156/orig -> origin/gh/pearu/156/orig 2025-12-04T08:53:59.3365979Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-12-04T08:53:59.3366047Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-12-04T08:53:59.3366112Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-12-04T08:53:59.3366179Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-12-04T08:53:59.3366245Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-12-04T08:53:59.3366310Z 
* [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-12-04T08:53:59.3366387Z * [new branch] gh/pianpwk/21/base -> origin/gh/pianpwk/21/base 2025-12-04T08:53:59.3366460Z * [new branch] gh/pianpwk/21/head -> origin/gh/pianpwk/21/head 2025-12-04T08:53:59.3366531Z * [new branch] gh/pianpwk/28/base -> origin/gh/pianpwk/28/base 2025-12-04T08:53:59.3366602Z * [new branch] gh/pianpwk/28/head -> origin/gh/pianpwk/28/head 2025-12-04T08:53:59.3366715Z * [new branch] gh/pianpwk/28/orig -> origin/gh/pianpwk/28/orig 2025-12-04T08:53:59.3366786Z * [new branch] gh/pianpwk/29/base -> origin/gh/pianpwk/29/base 2025-12-04T08:53:59.3366854Z * [new branch] gh/pianpwk/29/head -> origin/gh/pianpwk/29/head 2025-12-04T08:53:59.3366924Z * [new branch] gh/pianpwk/29/orig -> origin/gh/pianpwk/29/orig 2025-12-04T08:53:59.3366995Z * [new branch] gh/pianpwk/30/base -> origin/gh/pianpwk/30/base 2025-12-04T08:53:59.3367064Z * [new branch] gh/pianpwk/30/head -> origin/gh/pianpwk/30/head 2025-12-04T08:53:59.3367134Z * [new branch] gh/pianpwk/30/orig -> origin/gh/pianpwk/30/orig 2025-12-04T08:53:59.3367204Z * [new branch] gh/pianpwk/31/base -> origin/gh/pianpwk/31/base 2025-12-04T08:53:59.3367272Z * [new branch] gh/pianpwk/31/head -> origin/gh/pianpwk/31/head 2025-12-04T08:53:59.3367344Z * [new branch] gh/pianpwk/31/orig -> origin/gh/pianpwk/31/orig 2025-12-04T08:53:59.3367414Z * [new branch] gh/pianpwk/32/base -> origin/gh/pianpwk/32/base 2025-12-04T08:53:59.3367484Z * [new branch] gh/pianpwk/32/head -> origin/gh/pianpwk/32/head 2025-12-04T08:53:59.3367584Z * [new branch] gh/pianpwk/32/orig -> origin/gh/pianpwk/32/orig 2025-12-04T08:53:59.3367656Z * [new branch] gh/pianpwk/33/base -> origin/gh/pianpwk/33/base 2025-12-04T08:53:59.3367725Z * [new branch] gh/pianpwk/33/head -> origin/gh/pianpwk/33/head 2025-12-04T08:53:59.3367793Z * [new branch] gh/pianpwk/33/orig -> origin/gh/pianpwk/33/orig 2025-12-04T08:53:59.3367864Z * [new branch] gh/pianpwk/34/base -> origin/gh/pianpwk/34/base 
2025-12-04T08:53:59.3367935Z * [new branch] gh/pianpwk/34/head -> origin/gh/pianpwk/34/head 2025-12-04T08:53:59.3368004Z * [new branch] gh/pianpwk/34/orig -> origin/gh/pianpwk/34/orig 2025-12-04T08:53:59.3368074Z * [new branch] gh/pianpwk/35/base -> origin/gh/pianpwk/35/base 2025-12-04T08:53:59.3368142Z * [new branch] gh/pianpwk/35/head -> origin/gh/pianpwk/35/head 2025-12-04T08:53:59.3368212Z * [new branch] gh/pianpwk/35/orig -> origin/gh/pianpwk/35/orig 2025-12-04T08:53:59.3368282Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-12-04T08:53:59.3371620Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-12-04T08:53:59.3371697Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-12-04T08:53:59.3371763Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-12-04T08:53:59.3371825Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-12-04T08:53:59.3371930Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-12-04T08:53:59.3371995Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-12-04T08:53:59.3372058Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-12-04T08:53:59.3372119Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-12-04T08:53:59.3372188Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-12-04T08:53:59.3372251Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T08:53:59.3372312Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T08:53:59.3372376Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T08:53:59.3372438Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T08:53:59.3372501Z * [new branch] gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T08:53:59.3372629Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T08:53:59.3372692Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T08:53:59.3372757Z * [new branch] gh/rec/168/base -> 
origin/gh/rec/168/base 2025-12-04T08:53:59.3372822Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T08:53:59.3372885Z * [new branch] gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T08:53:59.3372949Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T08:53:59.3373011Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T08:53:59.3373073Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T08:53:59.3373137Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T08:53:59.3373200Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T08:53:59.3373262Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T08:53:59.3373327Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T08:53:59.3373442Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T08:53:59.3373504Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T08:53:59.3373569Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T08:53:59.3373631Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T08:53:59.3373694Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T08:53:59.3373758Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T08:53:59.3373823Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T08:53:59.3373886Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T08:53:59.3373949Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T08:53:59.3374010Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T08:53:59.3374074Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T08:53:59.3374138Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T08:53:59.3374199Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T08:53:59.3374263Z * [new branch] gh/rec/175/orig -> 
origin/gh/rec/175/orig 2025-12-04T08:53:59.3374324Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T08:53:59.3374391Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T08:53:59.3374454Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T08:53:59.3374516Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T08:53:59.3374578Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T08:53:59.3374644Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T08:53:59.3374734Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T08:53:59.3374818Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T08:53:59.3374900Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T08:53:59.3374982Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T08:53:59.3375097Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T08:53:59.3375179Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T08:53:59.3375261Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 2025-12-04T08:53:59.3375342Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T08:53:59.3375426Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T08:53:59.3375506Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T08:53:59.3375586Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 2025-12-04T08:53:59.3375669Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T08:53:59.3375751Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T08:53:59.3375836Z * [new branch] gh/robert-hardwick/7/head -> 
origin/gh/robert-hardwick/7/head 2025-12-04T08:53:59.3375915Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T08:53:59.3375997Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T08:53:59.3376110Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T08:53:59.3376189Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T08:53:59.3376268Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T08:53:59.3376349Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T08:53:59.3376430Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T08:53:59.3376504Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T08:53:59.3376573Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T08:53:59.3376639Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T08:53:59.3376703Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T08:53:59.3376778Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T08:53:59.3376844Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T08:53:59.3376910Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T08:53:59.3376980Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T08:53:59.3377045Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T08:53:59.3377112Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T08:53:59.3377178Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T08:53:59.3377245Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T08:53:59.3377310Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T08:53:59.3377379Z * [new branch] gh/rtimpe/25/base 
-> origin/gh/rtimpe/25/base 2025-12-04T08:53:59.3377445Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T08:53:59.3377514Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T08:53:59.3377579Z * [new branch] gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T08:53:59.3377643Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T08:53:59.3377735Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T08:53:59.3377801Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T08:53:59.3377866Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T08:53:59.3377933Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T08:53:59.3378004Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 2025-12-04T08:53:59.3378069Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T08:53:59.3378136Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T08:53:59.3378201Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T08:53:59.3378269Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T08:53:59.3378337Z * [new branch] gh/rtimpe/29/orig -> origin/gh/rtimpe/29/orig 2025-12-04T08:53:59.3378404Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T08:53:59.3378470Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T08:53:59.3378536Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T08:53:59.3378645Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T08:53:59.3378711Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T08:53:59.3378775Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T08:53:59.3378842Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T08:53:59.3378907Z * [new branch] gh/rtimpe/31/orig -> 
origin/gh/rtimpe/31/orig 2025-12-04T08:53:59.3378972Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T08:53:59.3379042Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T08:53:59.3379108Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T08:53:59.3379173Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T08:53:59.3379242Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T08:53:59.3379308Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T08:53:59.3379373Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T08:53:59.3379439Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T08:53:59.3379504Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T08:53:59.3379569Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T08:53:59.3379637Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T08:53:59.3379702Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T08:53:59.3379768Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T08:53:59.3379835Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T08:53:59.3379915Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T08:53:59.3379993Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T08:53:59.3380068Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T08:53:59.3380143Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T08:53:59.3380218Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T08:53:59.3380326Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T08:53:59.3380401Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 
2025-12-04T08:53:59.3380475Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T08:53:59.3380550Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T08:53:59.3380625Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T08:53:59.3380700Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-12-04T08:53:59.3380774Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T08:53:59.3380848Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T08:53:59.3380926Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T08:53:59.3381000Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-12-04T08:53:59.3381074Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T08:53:59.3381149Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T08:53:59.3381251Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T08:53:59.3381326Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T08:53:59.3381400Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T08:53:59.3381475Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T08:53:59.3381552Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T08:53:59.3381627Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T08:53:59.3381699Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T08:53:59.3381773Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T08:53:59.3381846Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T08:53:59.3381945Z * [new branch] gh/seemethere/53/orig -> 
origin/gh/seemethere/53/orig 2025-12-04T08:53:59.3382017Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T08:53:59.3382091Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T08:53:59.3382163Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T08:53:59.3382236Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T08:53:59.3382310Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T08:53:59.3382381Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T08:53:59.3382455Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T08:53:59.3382528Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T08:53:59.3382599Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T08:53:59.3382672Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T08:53:59.3382743Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T08:53:59.3382816Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T08:53:59.3382888Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T08:53:59.3383005Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-12-04T08:53:59.3383078Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T08:53:59.3383149Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T08:53:59.3383221Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T08:53:59.3383295Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T08:53:59.3383365Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T08:53:59.3383437Z * [new branch] gh/seemethere/72/head -> origin/gh/seemethere/72/head 
2025-12-04T08:53:59.3383510Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T08:53:59.3383581Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T08:53:59.3383654Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T08:53:59.3383727Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 2025-12-04T08:53:59.3383798Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T08:53:59.3383907Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T08:53:59.3383980Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T08:53:59.3384051Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T08:53:59.3384123Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T08:53:59.3384197Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T08:53:59.3384271Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T08:53:59.3384343Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T08:53:59.3384415Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T08:53:59.3384491Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T08:53:59.3384568Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T08:53:59.3384642Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T08:53:59.3384716Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T08:53:59.3384789Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T08:53:59.3384863Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T08:53:59.3384937Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 
2025-12-04T08:53:59.3385011Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T08:53:59.3385084Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T08:53:59.3385158Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T08:53:59.3385233Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T08:53:59.3385307Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T08:53:59.3385380Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T08:53:59.3385456Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T08:53:59.3385529Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T08:53:59.3385627Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T08:53:59.3385702Z * [new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T08:53:59.3385775Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T08:53:59.3385848Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T08:53:59.3385921Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T08:53:59.3385995Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T08:53:59.3386070Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 2025-12-04T08:53:59.3386142Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T08:53:59.3386214Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T08:53:59.3386290Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T08:53:59.3386363Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 2025-12-04T08:53:59.3386437Z * [new branch] 
gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T08:53:59.3386538Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T08:53:59.3386611Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T08:53:59.3386684Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T08:53:59.3386759Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T08:53:59.3386831Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T08:53:59.3386906Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T08:53:59.3386984Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T08:53:59.3387057Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T08:53:59.3387131Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T08:53:59.3387207Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T08:53:59.3387281Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T08:53:59.3387353Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T08:53:59.3387426Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T08:53:59.3387499Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T08:53:59.3387575Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T08:53:59.3387648Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T08:53:59.3387722Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T08:53:59.3387797Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T08:53:59.3387870Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 
2025-12-04T08:53:59.3387943Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T08:53:59.3388017Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T08:53:59.3388091Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T08:53:59.3388163Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T08:53:59.3388270Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 2025-12-04T08:53:59.3388343Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T08:53:59.3388417Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T08:53:59.3388495Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T08:53:59.3388568Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T08:53:59.3388638Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T08:53:59.3388711Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-12-04T08:53:59.3388781Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T08:53:59.3388852Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T08:53:59.3388924Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T08:53:59.3388993Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T08:53:59.3389062Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T08:53:59.3389182Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T08:53:59.3389253Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T08:53:59.3389325Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T08:53:59.3389394Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T08:53:59.3389464Z * [new branch] 
gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T08:53:59.3389537Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T08:53:59.3389607Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T08:53:59.3389677Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T08:53:59.3389748Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T08:53:59.3389819Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T08:53:59.3389889Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T08:53:59.3389960Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T08:53:59.3390030Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T08:53:59.3390100Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T08:53:59.3390172Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T08:53:59.3390241Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T08:53:59.3390310Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T08:53:59.3390379Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T08:53:59.3390450Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T08:53:59.3390518Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T08:53:59.3390747Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T08:53:59.3390816Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T08:53:59.3390889Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T08:53:59.3391280Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T08:53:59.3391352Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T08:53:59.3391425Z 
* [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-12-04T08:53:59.3391498Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T08:53:59.3391570Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T08:53:59.3391643Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T08:53:59.3391713Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T08:53:59.3391787Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T08:53:59.3391888Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-12-04T08:53:59.3391963Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T08:53:59.3392034Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T08:53:59.3392106Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T08:53:59.3392228Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-12-04T08:53:59.3392300Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T08:53:59.3392373Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T08:53:59.3392444Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T08:53:59.3392515Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T08:53:59.3392587Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T08:53:59.3392660Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T08:53:59.3392731Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T08:53:59.3392803Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T08:53:59.3392875Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T08:53:59.3392947Z * [new branch] gh/soulitzer/313/orig -> 
origin/gh/soulitzer/313/orig 2025-12-04T08:53:59.3393018Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T08:53:59.3393088Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T08:53:59.3393162Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T08:53:59.3393234Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T08:53:59.3393309Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T08:53:59.3393382Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T08:53:59.3393453Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T08:53:59.3393532Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T08:53:59.3393604Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T08:53:59.3393674Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T08:53:59.3393746Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T08:53:59.3393817Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T08:53:59.3393888Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T08:53:59.3394012Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T08:53:59.3394085Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T08:53:59.3394156Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-12-04T08:53:59.3394230Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T08:53:59.3394304Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T08:53:59.3394375Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T08:53:59.3394447Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 
2025-12-04T08:53:59.3394520Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T08:53:59.3394598Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-12-04T08:53:59.3394671Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T08:53:59.3394744Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T08:53:59.3394815Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T08:53:59.3394919Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-12-04T08:53:59.3394991Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T08:53:59.3395062Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T08:53:59.3395135Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T08:53:59.3395206Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T08:53:59.3395279Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T08:53:59.3395352Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T08:53:59.3395423Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T08:53:59.3395495Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T08:53:59.3395568Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T08:53:59.3395639Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T08:53:59.3395710Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T08:53:59.3395781Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T08:53:59.3395852Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T08:53:59.3395926Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T08:53:59.3395997Z * [new 
branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T08:53:59.3396068Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T08:53:59.3396143Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T08:53:59.3396214Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T08:53:59.3396286Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T08:53:59.3396359Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T08:53:59.3396431Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T08:53:59.3396502Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T08:53:59.3396612Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T08:53:59.3396683Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T08:53:59.3396754Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T08:53:59.3396829Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 2025-12-04T08:53:59.3396900Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T08:53:59.3396971Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T08:53:59.3397044Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T08:53:59.3397115Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T08:53:59.3397185Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T08:53:59.3397260Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 2025-12-04T08:53:59.3397331Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T08:53:59.3397403Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T08:53:59.3397503Z * [new branch] gh/soulitzer/392/base -> 
origin/gh/soulitzer/392/base 2025-12-04T08:53:59.3397575Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 2025-12-04T08:53:59.3397648Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T08:53:59.3397719Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T08:53:59.3397789Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T08:53:59.3397862Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T08:53:59.3397931Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T08:53:59.3398001Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T08:53:59.3398073Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T08:53:59.3398143Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T08:53:59.3398213Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T08:53:59.3398282Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T08:53:59.3398351Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T08:53:59.3398420Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T08:53:59.3398493Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T08:53:59.3398563Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T08:53:59.3398633Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T08:53:59.3398702Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T08:53:59.3398773Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T08:53:59.3398842Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T08:53:59.3398914Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T08:53:59.3398983Z * [new branch] 
gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T08:53:59.3399053Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T08:53:59.3399162Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T08:53:59.3399232Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T08:53:59.3399304Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T08:53:59.3399375Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T08:53:59.3399445Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T08:53:59.3399518Z * [new branch] gh/swolchok/856/base -> origin/gh/swolchok/856/base 2025-12-04T08:53:59.3399587Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T08:53:59.3399657Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T08:53:59.3399726Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T08:53:59.3399797Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T08:53:59.3399866Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T08:53:59.3399937Z * [new branch] gh/swolchok/861/base -> origin/gh/swolchok/861/base 2025-12-04T08:53:59.3400006Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T08:53:59.3400107Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T08:53:59.3400178Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T08:53:59.3400248Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 2025-12-04T08:53:59.3400319Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T08:53:59.3400391Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T08:53:59.3400464Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 
2025-12-04T08:53:59.3400534Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T08:53:59.3400603Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T08:53:59.3400674Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T08:53:59.3400744Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T08:53:59.3400814Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T08:53:59.3400883Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T08:53:59.3400953Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T08:53:59.3401022Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T08:53:59.3401092Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T08:53:59.3401163Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T08:53:59.3401232Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T08:53:59.3401303Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T08:53:59.3401373Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T08:53:59.3401442Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T08:53:59.3401511Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T08:53:59.3401584Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T08:53:59.3401655Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T08:53:59.3401752Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T08:53:59.3401823Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T08:53:59.3401935Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T08:53:59.3402007Z * [new branch] gh/swolchok/870/head -> 
origin/gh/swolchok/870/head 2025-12-04T08:53:59.3402078Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T08:53:59.3402147Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T08:53:59.3402218Z * [new branch] gh/swolchok/871/head -> origin/gh/swolchok/871/head 2025-12-04T08:53:59.3402287Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T08:53:59.3402357Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T08:53:59.3402429Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T08:53:59.3402497Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T08:53:59.3402566Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T08:53:59.3402684Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-12-04T08:53:59.3402752Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T08:53:59.3402818Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T08:53:59.3402885Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T08:53:59.3402952Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T08:53:59.3403019Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T08:53:59.3403088Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T08:53:59.3403177Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T08:53:59.3403262Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T08:53:59.3403347Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T08:53:59.3403428Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T08:53:59.3403511Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T08:53:59.3403594Z * [new branch] 
gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T08:53:59.3403675Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T08:53:59.3403759Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T08:53:59.3403839Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T08:53:59.3403921Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T08:53:59.3404003Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T08:53:59.3404082Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T08:53:59.3404164Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T08:53:59.3404247Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T08:53:59.3404327Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T08:53:59.3404407Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T08:53:59.3404534Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T08:53:59.3404615Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T08:53:59.3404695Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T08:53:59.3404780Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T08:53:59.3404861Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T08:53:59.3404941Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T08:53:59.3405024Z * [new branch] gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-12-04T08:53:59.3405105Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T08:53:59.3405187Z * [new 
branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T08:53:59.3405269Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T08:53:59.3405349Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T08:53:59.3405454Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 2025-12-04T08:53:59.3405535Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T08:53:59.3405615Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T08:53:59.3405697Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T08:53:59.3405777Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-12-04T08:53:59.3405859Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T08:53:59.3405943Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T08:53:59.3406024Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T08:53:59.3406104Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T08:53:59.3406186Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T08:53:59.3406267Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T08:53:59.3406349Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T08:53:59.3406432Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T08:53:59.3406513Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T08:53:59.3406597Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T08:53:59.3406678Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 
2025-12-04T08:53:59.3406758Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T08:53:59.3406844Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T08:53:59.3406923Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T08:53:59.3407004Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T08:53:59.3407091Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T08:53:59.3407171Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T08:53:59.3407279Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T08:53:59.3407359Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T08:53:59.3407442Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T08:53:59.3407523Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T08:53:59.3407607Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T08:53:59.3407687Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T08:53:59.3407769Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T08:53:59.3407850Z * [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T08:53:59.3407930Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T08:53:59.3408014Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T08:53:59.3408095Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T08:53:59.3408177Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T08:53:59.3408287Z * [new branch] gh/tugsbayasgalan/67/head -> 
origin/gh/tugsbayasgalan/67/head 2025-12-04T08:53:59.3408369Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T08:53:59.3408450Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T08:53:59.3408532Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T08:53:59.3408613Z * [new branch] gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T08:53:59.3408695Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T08:53:59.3408775Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T08:53:59.3408853Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T08:53:59.3408935Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T08:53:59.3409018Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T08:53:59.3409098Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T08:53:59.3409179Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T08:53:59.3409262Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T08:53:59.3409343Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T08:53:59.3409427Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T08:53:59.3409509Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T08:53:59.3409591Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T08:53:59.3409676Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T08:53:59.3409758Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T08:53:59.3409839Z * [new branch] 
gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T08:53:59.3409922Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T08:53:59.3410003Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T08:53:59.3410123Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T08:53:59.3410206Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T08:53:59.3410287Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T08:53:59.3410369Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T08:53:59.3410450Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T08:53:59.3410531Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T08:53:59.3410611Z * [new branch] gh/tugsbayasgalan/76/orig -> origin/gh/tugsbayasgalan/76/orig 2025-12-04T08:53:59.3410693Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T08:53:59.3410775Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T08:53:59.3410856Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T08:53:59.3410936Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T08:53:59.3411022Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 2025-12-04T08:53:59.3411136Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T08:53:59.3411217Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T08:53:59.3411298Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T08:53:59.3411380Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 2025-12-04T08:53:59.3411460Z 
* [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T08:53:59.3411540Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T08:53:59.3411621Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T08:53:59.3411702Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T08:53:59.3411784Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T08:53:59.3411906Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T08:53:59.3411987Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T08:53:59.3412068Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T08:53:59.3412151Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T08:53:59.3412233Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T08:53:59.3412313Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T08:53:59.3412395Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T08:53:59.3412477Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T08:53:59.3412559Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T08:53:59.3412639Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T08:53:59.3412720Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T08:53:59.3412801Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T08:53:59.3412883Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T08:53:59.3413009Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 
2025-12-04T08:53:59.3413091Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T08:53:59.3413172Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T08:53:59.3413253Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T08:53:59.3413334Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T08:53:59.3413415Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T08:53:59.3413495Z * [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T08:53:59.3413579Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T08:53:59.3413661Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T08:53:59.3413741Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T08:53:59.3413823Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T08:53:59.3413940Z * [new branch] gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T08:53:59.3414021Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T08:53:59.3414102Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T08:53:59.3414183Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T08:53:59.3414264Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T08:53:59.3414343Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T08:53:59.3414424Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T08:53:59.3414507Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T08:53:59.3414587Z * [new branch] gh/tugsbayasgalan/90/head -> 
origin/gh/tugsbayasgalan/90/head 2025-12-04T08:53:59.3414668Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T08:53:59.3414749Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T08:53:59.3414829Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T08:53:59.3414909Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T08:53:59.3414991Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T08:53:59.3415073Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T08:53:59.3415154Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T08:53:59.3415235Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T08:53:59.3415317Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T08:53:59.3415398Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T08:53:59.3415466Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T08:53:59.3415530Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T08:53:59.3415595Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T08:53:59.3415659Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T08:53:59.3415752Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T08:53:59.3415815Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T08:53:59.3415876Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T08:53:59.3415938Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T08:53:59.3416000Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T08:53:59.3416061Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T08:53:59.3416122Z * [new branch] 
gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T08:53:59.3416183Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 2025-12-04T08:53:59.3416245Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T08:53:59.3416307Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T08:53:59.3416371Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T08:53:59.3416434Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T08:53:59.3416524Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T08:53:59.3416587Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T08:53:59.3416666Z * [new branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T08:53:59.3416743Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T08:53:59.3416818Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T08:53:59.3416892Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T08:53:59.3416965Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-12-04T08:53:59.3417039Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T08:53:59.3417111Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T08:53:59.3417185Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T08:53:59.3417259Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T08:53:59.3417331Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T08:53:59.3417404Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T08:53:59.3417468Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T08:53:59.3417533Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T08:53:59.3417599Z * [new branch] gh/vkuzo/3/next -> 
origin/gh/vkuzo/3/next 2025-12-04T08:53:59.3417672Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T08:53:59.3417744Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T08:53:59.3417818Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T08:53:59.3417887Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T08:53:59.3417957Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T08:53:59.3418028Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T08:53:59.3418098Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T08:53:59.3418167Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T08:53:59.3418263Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T08:53:59.3418332Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T08:53:59.3418401Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T08:53:59.3418473Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T08:53:59.3418542Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T08:53:59.3418611Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T08:53:59.3418681Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T08:53:59.3418750Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T08:53:59.3418821Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T08:53:59.3418891Z * [new branch] gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T08:53:59.3418961Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T08:53:59.3419031Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T08:53:59.3419139Z * [new branch] 
gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T08:53:59.3419209Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T08:53:59.3419282Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 2025-12-04T08:53:59.3419351Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T08:53:59.3419420Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T08:53:59.3419491Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T08:53:59.3419563Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T08:53:59.3419632Z * [new branch] gh/wconstab/453/base -> origin/gh/wconstab/453/base 2025-12-04T08:53:59.3419703Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T08:53:59.3419773Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T08:53:59.3419842Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T08:53:59.3419912Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T08:53:59.3419984Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T08:53:59.3420053Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T08:53:59.3420124Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T08:53:59.3420195Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T08:53:59.3420264Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T08:53:59.3420335Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T08:53:59.3420406Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T08:53:59.3420476Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T08:53:59.3420546Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 
2025-12-04T08:53:59.3420616Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T08:53:59.3420687Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T08:53:59.3420785Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T08:53:59.3420854Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T08:53:59.3420926Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T08:53:59.3420995Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T08:53:59.3421066Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T08:53:59.3421137Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T08:53:59.3421207Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T08:53:59.3421276Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T08:53:59.3421347Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T08:53:59.3421418Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T08:53:59.3421487Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T08:53:59.3421559Z * [new branch] gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T08:53:59.3421628Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T08:53:59.3421723Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T08:53:59.3421795Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T08:53:59.3421903Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T08:53:59.3421977Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 2025-12-04T08:53:59.3422047Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T08:53:59.3422119Z * [new branch] gh/wconstab/464/head -> 
origin/gh/wconstab/464/head 2025-12-04T08:53:59.3422190Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T08:53:59.3422260Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T08:53:59.3422331Z * [new branch] gh/wconstab/465/head -> origin/gh/wconstab/465/head 2025-12-04T08:53:59.3422403Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T08:53:59.3422472Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T08:53:59.3422542Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T08:53:59.3422613Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T08:53:59.3422683Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T08:53:59.3422756Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T08:53:59.3422829Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T08:53:59.3422900Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T08:53:59.3422969Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T08:53:59.3423042Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T08:53:59.3423114Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T08:53:59.3423185Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T08:53:59.3423257Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T08:53:59.3423328Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T08:53:59.3423437Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T08:53:59.3423507Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T08:53:59.3423577Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T08:53:59.3423650Z * [new branch] 
gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T08:53:59.3423721Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T08:53:59.3423801Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T08:53:59.3423882Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T08:53:59.3423960Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T08:53:59.3424039Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T08:53:59.3424118Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T08:53:59.3424196Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T08:53:59.3424272Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-12-04T08:53:59.3424392Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T08:53:59.3424470Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T08:53:59.3424546Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T08:53:59.3424623Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T08:53:59.3424700Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-12-04T08:53:59.3424778Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T08:53:59.3424858Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T08:53:59.3424934Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T08:53:59.3425014Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T08:53:59.3425091Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T08:53:59.3425167Z * [new branch] gh/williamwen42/296/orig -> 
origin/gh/williamwen42/296/orig 2025-12-04T08:53:59.3425245Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T08:53:59.3425321Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T08:53:59.3425397Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T08:53:59.3425476Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T08:53:59.3425552Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T08:53:59.3425627Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T08:53:59.3425707Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T08:53:59.3425784Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T08:53:59.3425860Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T08:53:59.3425938Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T08:53:59.3426014Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T08:53:59.3426120Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T08:53:59.3426198Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T08:53:59.3426274Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T08:53:59.3426350Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T08:53:59.3426429Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T08:53:59.3426505Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T08:53:59.3426582Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T08:53:59.3426658Z * [new branch] gh/williamwen42/325/base -> 
origin/gh/williamwen42/325/base 2025-12-04T08:53:59.3426736Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T08:53:59.3426818Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T08:53:59.3426894Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T08:53:59.3426971Z * [new branch] gh/williamwen42/326/head -> origin/gh/williamwen42/326/head 2025-12-04T08:53:59.3427076Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T08:53:59.3427152Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T08:53:59.3427232Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T08:53:59.3427308Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T08:53:59.3427386Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 2025-12-04T08:53:59.3427463Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T08:53:59.3427541Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T08:53:59.3427619Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T08:53:59.3427695Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T08:53:59.3427773Z * [new branch] gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T08:53:59.3427850Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T08:53:59.3427927Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T08:53:59.3428004Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T08:53:59.3428082Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T08:53:59.3428161Z * [new branch] gh/williamwen42/331/head -> 
origin/gh/williamwen42/331/head 2025-12-04T08:53:59.3428237Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T08:53:59.3428315Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T08:53:59.3428394Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T08:53:59.3428470Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T08:53:59.3428548Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T08:53:59.3428625Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T08:53:59.3428703Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T08:53:59.3428780Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T08:53:59.3428886Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T08:53:59.3428964Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T08:53:59.3429040Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T08:53:59.3429118Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T08:53:59.3429195Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T08:53:59.3429270Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T08:53:59.3429347Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T08:53:59.3429424Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T08:53:59.3429502Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T08:53:59.3429578Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T08:53:59.3429657Z * [new branch] gh/williamwen42/337/orig -> 
origin/gh/williamwen42/337/orig 2025-12-04T08:53:59.3429768Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T08:53:59.3429844Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T08:53:59.3429923Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T08:53:59.3430000Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T08:53:59.3430075Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 2025-12-04T08:53:59.3430153Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T08:53:59.3430230Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T08:53:59.3430308Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T08:53:59.3430384Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T08:53:59.3430461Z * [new branch] gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T08:53:59.3430539Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T08:53:59.3430614Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T08:53:59.3430690Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T08:53:59.3430768Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T08:53:59.3430845Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T08:53:59.3430921Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T08:53:59.3430999Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T08:53:59.3431077Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T08:53:59.3431153Z * [new branch] gh/williamwen42/344/base -> 
origin/gh/williamwen42/344/base 2025-12-04T08:53:59.3431229Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T08:53:59.3431305Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T08:53:59.3431381Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T08:53:59.3431457Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T08:53:59.3431563Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T08:53:59.3431641Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T08:53:59.3431717Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T08:53:59.3431794Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T08:53:59.3431902Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T08:53:59.3431979Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T08:53:59.3432055Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T08:53:59.3432133Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T08:53:59.3432211Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T08:53:59.3432287Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T08:53:59.3432364Z * [new branch] gh/williamwen42/349/base -> origin/gh/williamwen42/349/base 2025-12-04T08:53:59.3432483Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T08:53:59.3432561Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T08:53:59.3432638Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T08:53:59.3432714Z * [new branch] gh/williamwen42/350/head -> 
origin/gh/williamwen42/350/head 2025-12-04T08:53:59.3432790Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 2025-12-04T08:53:59.3432869Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T08:53:59.3432947Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T08:53:59.3433022Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T08:53:59.3433100Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T08:53:59.3433177Z * [new branch] gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T08:53:59.3433256Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T08:53:59.3433332Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T08:53:59.3433408Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T08:53:59.3433487Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T08:53:59.3433564Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T08:53:59.3433640Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T08:53:59.3433715Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T08:53:59.3433793Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T08:53:59.3433868Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T08:53:59.3433946Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T08:53:59.3434022Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T08:53:59.3434098Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T08:53:59.3434175Z * [new branch] gh/williamwen42/356/orig -> 
origin/gh/williamwen42/356/orig 2025-12-04T08:53:59.3434288Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T08:53:59.3434364Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T08:53:59.3434441Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T08:53:59.3434520Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T08:53:59.3434597Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T08:53:59.3434673Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T08:53:59.3434742Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T08:53:59.3434809Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T08:53:59.3434876Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T08:53:59.3434944Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T08:53:59.3435010Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T08:53:59.3435074Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T08:53:59.3435166Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T08:53:59.3435232Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T08:53:59.3435296Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T08:53:59.3435361Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T08:53:59.3435427Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T08:53:59.3435491Z * [new branch] gh/xmfan/301/head -> origin/gh/xmfan/301/head 2025-12-04T08:53:59.3435558Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T08:53:59.3435624Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T08:53:59.3435688Z * [new branch] gh/xmfan/304/head -> 
origin/gh/xmfan/304/head 2025-12-04T08:53:59.3435754Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T08:53:59.3435819Z * [new branch] gh/xmfan/309/base -> origin/gh/xmfan/309/base 2025-12-04T08:53:59.3435884Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T08:53:59.3435948Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T08:53:59.3436014Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T08:53:59.3436078Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T08:53:59.3436143Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T08:53:59.3436209Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T08:53:59.3436273Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T08:53:59.3436340Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T08:53:59.3436405Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T08:53:59.3436470Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T08:53:59.3436535Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T08:53:59.3436600Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T08:53:59.3436664Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T08:53:59.3436755Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T08:53:59.3436833Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T08:53:59.3436909Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T08:53:59.3436987Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T08:53:59.3437063Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T08:53:59.3437136Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 
2025-12-04T08:53:59.3437212Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T08:53:59.3437286Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T08:53:59.3437360Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T08:53:59.3437437Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T08:53:59.3437511Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T08:53:59.3437584Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T08:53:59.3437689Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T08:53:59.3437762Z * [new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T08:53:59.3437837Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T08:53:59.3437911Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T08:53:59.3437983Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T08:53:59.3438059Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T08:53:59.3438129Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-12-04T08:53:59.3438200Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T08:53:59.3438272Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T08:53:59.3438341Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T08:53:59.3438409Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T08:53:59.3438477Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-12-04T08:53:59.3438545Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T08:53:59.3438613Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 
2025-12-04T08:53:59.3438683Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T08:53:59.3438751Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T08:53:59.3438819Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T08:53:59.3438888Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T08:53:59.3438957Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T08:53:59.3439025Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T08:53:59.3439096Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T08:53:59.3439165Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T08:53:59.3439233Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T08:53:59.3439366Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T08:53:59.3439434Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T08:53:59.3439503Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T08:53:59.3439571Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T08:53:59.3439641Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T08:53:59.3439711Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T08:53:59.3439779Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T08:53:59.3439846Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T08:53:59.3439915Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T08:53:59.3439984Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T08:53:59.3440052Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T08:53:59.3440121Z * [new branch] gh/yanbing-j/23/head -> 
origin/gh/yanbing-j/23/head 2025-12-04T08:53:59.3440215Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T08:53:59.3440283Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T08:53:59.3440353Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-12-04T08:53:59.3440421Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T08:53:59.3440488Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T08:53:59.3440557Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T08:53:59.3440627Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T08:53:59.3440694Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T08:53:59.3440765Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-12-04T08:53:59.3440834Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T08:53:59.3440912Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T08:53:59.3440986Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T08:53:59.3441058Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T08:53:59.3441133Z * [new branch] gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T08:53:59.3441204Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T08:53:59.3441277Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T08:53:59.3441350Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T08:53:59.3441421Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T08:53:59.3441493Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T08:53:59.3441566Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 
2025-12-04T08:53:59.3441636Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T08:53:59.3441706Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T08:53:59.3441779Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T08:53:59.3441867Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T08:53:59.3441976Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T08:53:59.3442049Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T08:53:59.3442117Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T08:53:59.3442187Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T08:53:59.3442257Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T08:53:59.3442326Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T08:53:59.3442395Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T08:53:59.3442464Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T08:53:59.3442532Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T08:53:59.3442603Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T08:53:59.3442671Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T08:53:59.3442740Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T08:53:59.3442868Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T08:53:59.3442937Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T08:53:59.3443006Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T08:53:59.3443077Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T08:53:59.3443144Z * [new branch] gh/ydwu4/292/base -> 
origin/gh/ydwu4/292/base 2025-12-04T08:53:59.3443211Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T08:53:59.3443278Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T08:53:59.3443343Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T08:53:59.3443408Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T08:53:59.3443475Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-12-04T08:53:59.3443540Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T08:53:59.3443604Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T08:53:59.3443670Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T08:53:59.3443737Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T08:53:59.3443801Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-12-04T08:53:59.3443868Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T08:53:59.3443932Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T08:53:59.3443998Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T08:53:59.3444063Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T08:53:59.3444128Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T08:53:59.3444194Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T08:53:59.3444259Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T08:53:59.3444323Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T08:53:59.3444388Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T08:53:59.3444478Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T08:53:59.3444542Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T08:53:59.3444608Z * [new branch] gh/ydwu4/327/head -> 
origin/gh/ydwu4/327/head 2025-12-04T08:53:59.3444673Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T08:53:59.3444738Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T08:53:59.3444804Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T08:53:59.3444868Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T08:53:59.3444933Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T08:53:59.3444999Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T08:53:59.3445064Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T08:53:59.3445128Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T08:53:59.3445193Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T08:53:59.3445284Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T08:53:59.3445347Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T08:53:59.3445412Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T08:53:59.3445476Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T08:53:59.3445542Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T08:53:59.3445605Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T08:53:59.3445673Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-12-04T08:53:59.3445739Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T08:53:59.3445802Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T08:53:59.3445866Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T08:53:59.3445933Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T08:53:59.3445997Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T08:53:59.3446061Z * [new branch] gh/ydwu4/334/orig -> 
origin/gh/ydwu4/334/orig 2025-12-04T08:53:59.3446125Z * [new branch] gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T08:53:59.3446190Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T08:53:59.3446255Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T08:53:59.3446319Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T08:53:59.3446383Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 2025-12-04T08:53:59.3446447Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T08:53:59.3446513Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T08:53:59.3446578Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T08:53:59.3446641Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T08:53:59.3446706Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T08:53:59.3446769Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T08:53:59.3446857Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T08:53:59.3446922Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T08:53:59.3446994Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T08:53:59.3447066Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T08:53:59.3447138Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T08:53:59.3447208Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T08:53:59.3447279Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T08:53:59.3447349Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T08:53:59.3447421Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T08:53:59.3447494Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T08:53:59.3447564Z * [new 
branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T08:53:59.3447632Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T08:53:59.3447709Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T08:53:59.3447812Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T08:53:59.3447884Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T08:53:59.3447957Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T08:53:59.3448029Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T08:53:59.3448100Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T08:53:59.3448175Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T08:53:59.3448247Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T08:53:59.3448318Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 2025-12-04T08:53:59.3448392Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T08:53:59.3448463Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T08:53:59.3448534Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T08:53:59.3448607Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T08:53:59.3448677Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T08:53:59.3448747Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T08:53:59.3448818Z * [new branch] gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T08:53:59.3448887Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T08:53:59.3448960Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T08:53:59.3449030Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T08:53:59.3449099Z * [new branch] 
gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-12-04T08:53:59.3449168Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T08:53:59.3449233Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T08:53:59.3449298Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T08:53:59.3449368Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T08:53:59.3449477Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T08:53:59.3449542Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T08:53:59.3449608Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T08:53:59.3449674Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T08:53:59.3449741Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T08:53:59.3449809Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T08:53:59.3449874Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T08:53:59.3449940Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T08:53:59.3450007Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T08:53:59.3450076Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T08:53:59.3450142Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T08:53:59.3450212Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T08:53:59.3450277Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T08:53:59.3450375Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T08:53:59.3450445Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T08:53:59.3450514Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T08:53:59.3450585Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T08:53:59.3450652Z * [new 
branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T08:53:59.3450722Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T08:53:59.3450791Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T08:53:59.3450859Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T08:53:59.3450928Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T08:53:59.3451000Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T08:53:59.3451068Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T08:53:59.3451137Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T08:53:59.3451205Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T08:53:59.3451272Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T08:53:59.3451342Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T08:53:59.3451412Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T08:53:59.3451479Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-12-04T08:53:59.3451546Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T08:53:59.3451616Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T08:53:59.3451681Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T08:53:59.3451747Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-12-04T08:53:59.3451814Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T08:53:59.3451909Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T08:53:59.3452023Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T08:53:59.3452089Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T08:53:59.3452156Z * [new branch] gh/zpcore/13/orig -> 
origin/gh/zpcore/13/orig 2025-12-04T08:53:59.3452226Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T08:53:59.3452291Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T08:53:59.3452357Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T08:53:59.3452424Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T08:53:59.3452490Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T08:53:59.3452555Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T08:53:59.3452622Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T08:53:59.3452690Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T08:53:59.3452754Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T08:53:59.3452821Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T08:53:59.3452932Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T08:53:59.3452999Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T08:53:59.3453067Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T08:53:59.3453133Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T08:53:59.3453198Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T08:53:59.3453266Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T08:53:59.3453333Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T08:53:59.3453398Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T08:53:59.3453466Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T08:53:59.3453532Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T08:53:59.3453598Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T08:53:59.3453665Z * [new branch] gh/zpcore/25/head -> 
origin/gh/zpcore/25/head 2025-12-04T08:53:59.3453730Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T08:53:59.3453799Z * [new branch] gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T08:53:59.3453866Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T08:53:59.3453932Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T08:53:59.3454000Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T08:53:59.3454064Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T08:53:59.3454131Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T08:53:59.3454198Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 2025-12-04T08:53:59.3454263Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T08:53:59.3454329Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T08:53:59.3454399Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T08:53:59.3454466Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T08:53:59.3454560Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T08:53:59.3454631Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T08:53:59.3454696Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T08:53:59.3454761Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T08:53:59.3454832Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T08:53:59.3454898Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T08:53:59.3454964Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T08:53:59.3455031Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T08:53:59.3455096Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T08:53:59.3455167Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 
2025-12-04T08:53:59.3455234Z * [new branch] google-main -> origin/google-main 2025-12-04T08:53:59.3455319Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T08:53:59.3455393Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T08:53:59.3455560Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T08:53:59.3455677Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T08:53:59.3455820Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T08:53:59.3455928Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T08:53:59.3455994Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T08:53:59.3456060Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T08:53:59.3456121Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T08:53:59.3456311Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T08:53:59.3456374Z * [new branch] inlining -> origin/inlining 2025-12-04T08:53:59.3456444Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T08:53:59.3456527Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T08:53:59.3456710Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T08:53:59.3456780Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T08:53:59.3456849Z * [new branch] issue#58739 -> origin/issue#58739 2025-12-04T08:53:59.3456928Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T08:53:59.3456988Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T08:53:59.3457052Z * [new branch] jathu/sve -> origin/jathu/sve 
2025-12-04T08:53:59.3457175Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T08:53:59.3457279Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T08:53:59.3457391Z * [new branch] jiannanWang/memorysnapshot_filter -> origin/jiannanWang/memorysnapshot_filter 2025-12-04T08:53:59.3457500Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T08:53:59.3457611Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T08:53:59.3457699Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T08:53:59.3457781Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T08:53:59.3457862Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T08:53:59.3457944Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T08:53:59.3458022Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T08:53:59.3458101Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T08:53:59.3458181Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T08:53:59.3458260Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T08:53:59.3458342Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T08:53:59.3458422Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T08:53:59.3458494Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T08:53:59.3458599Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T08:53:59.3458676Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T08:53:59.3458779Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T08:53:59.3458881Z * 
[new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T08:53:59.3458957Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T08:53:59.3459058Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T08:53:59.3459136Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T08:53:59.3459202Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T08:53:59.3459268Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T08:53:59.3459346Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T08:53:59.3459422Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T08:53:59.3459505Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T08:53:59.3459604Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T08:53:59.3459708Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T08:53:59.3459832Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T08:53:59.3459946Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T08:53:59.3460077Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T08:53:59.3460156Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T08:53:59.3460247Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T08:53:59.3460343Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T08:53:59.3460437Z * [new branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T08:53:59.3460563Z * [new branch] 
lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T08:53:59.3460656Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T08:53:59.3460764Z * [new branch] lucaskabela/typing_variables_dicts -> origin/lucaskabela/typing_variables_dicts 2025-12-04T08:53:59.3460883Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T08:53:59.3460988Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T08:53:59.3461062Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T08:53:59.3461122Z * [new branch] main -> origin/main 2025-12-04T08:53:59.3461191Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T08:53:59.3461261Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T08:53:59.3461326Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T08:53:59.3461390Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T08:53:59.3461456Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T08:53:59.3461547Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T08:53:59.3461612Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T08:53:59.3461676Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T08:53:59.3461749Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T08:53:59.3461940Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T08:53:59.3462108Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T08:53:59.3462234Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T08:53:59.3462333Z * [new branch] malfet/mps-implement-col2im -> 
origin/malfet/mps-implement-col2im 2025-12-04T08:53:59.3462449Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T08:53:59.3462539Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T08:53:59.3462614Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T08:53:59.3462688Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T08:53:59.3462766Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T08:53:59.3462843Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T08:53:59.3462917Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T08:53:59.3462979Z * [new branch] mingw_posix -> origin/mingw_posix 2025-12-04T08:53:59.3463055Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T08:53:59.3463115Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T08:53:59.3463177Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T08:53:59.3463248Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T08:53:59.3463324Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T08:53:59.3463424Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T08:53:59.3463538Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T08:53:59.3463602Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T08:53:59.3463669Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T08:53:59.3463735Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T08:53:59.3463796Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 2025-12-04T08:53:59.3463868Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T08:53:59.3463940Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 
2025-12-04T08:53:59.3464013Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T08:53:59.3464095Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T08:53:59.3464197Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T08:53:59.3464270Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T08:53:59.3464351Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T08:53:59.3464465Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T08:53:59.3464533Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T08:53:59.3464603Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T08:53:59.3464669Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T08:53:59.3464738Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T08:53:59.3464804Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T08:53:59.3464873Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T08:53:59.3464935Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T08:53:59.3465015Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T08:53:59.3465084Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T08:53:59.3465147Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T08:53:59.3465212Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T08:53:59.3465290Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T08:53:59.3465358Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T08:53:59.3465423Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T08:53:59.3465489Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T08:53:59.3465556Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 
2025-12-04T08:53:59.3465623Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T08:53:59.3465682Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T08:53:59.3465754Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T08:53:59.3465821Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T08:53:59.3465888Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-12-04T08:53:59.3465956Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T08:53:59.3466020Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T08:53:59.3466084Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T08:53:59.3466175Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T08:53:59.3466236Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T08:53:59.3466296Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T08:53:59.3466359Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-12-04T08:53:59.3466418Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T08:53:59.3466476Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T08:53:59.3466539Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T08:53:59.3466599Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T08:53:59.3466658Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-12-04T08:53:59.3466720Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T08:53:59.3466777Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T08:53:59.3466835Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T08:53:59.3466895Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T08:53:59.3466996Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T08:53:59.3467080Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T08:53:59.3467141Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T08:53:59.3467201Z * [new branch] 
mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T08:53:59.3467276Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T08:53:59.3467381Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T08:53:59.3467480Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T08:53:59.3467552Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T08:53:59.3467619Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T08:53:59.3467686Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T08:53:59.3467761Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T08:53:59.3467836Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T08:53:59.3467904Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T08:53:59.3467975Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T08:53:59.3468047Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T08:53:59.3468113Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T08:53:59.3468181Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T08:53:59.3468242Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T08:53:59.3468320Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T08:53:59.3468408Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T08:53:59.3468471Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T08:53:59.3468541Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T08:53:59.3468606Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T08:53:59.3468683Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T08:53:59.3468770Z * [new branch] mlazos/test -> origin/mlazos/test 2025-12-04T08:53:59.3468835Z * [new branch] 
mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T08:53:59.3468913Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T08:53:59.3468997Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T08:53:59.3469074Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T08:53:59.3469149Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T08:53:59.3469230Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-12-04T08:53:59.3469304Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T08:53:59.3469377Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T08:53:59.3469455Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T08:53:59.3469531Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T08:53:59.3469612Z * [new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T08:53:59.3469726Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T08:53:59.3469818Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T08:53:59.3469912Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T08:53:59.3469986Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T08:53:59.3470055Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T08:53:59.3470127Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T08:53:59.3470203Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T08:53:59.3470267Z * [new branch] module-shim -> origin/module-shim 2025-12-04T08:53:59.3470329Z * [new branch] move_config -> origin/move_config 2025-12-04T08:53:59.3470402Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T08:53:59.3470470Z * [new branch] mtia/basic-cmake -> 
origin/mtia/basic-cmake 2025-12-04T08:53:59.3470577Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T08:53:59.3470644Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T08:53:59.3470718Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T08:53:59.3470783Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T08:53:59.3470849Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T08:53:59.3470921Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T08:53:59.3470988Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T08:53:59.3471065Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T08:53:59.3471135Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T08:53:59.3471224Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T08:53:59.3471288Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T08:53:59.3471355Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T08:53:59.3471426Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T08:53:59.3471518Z * [new branch] nightly -> origin/nightly 2025-12-04T08:53:59.3471633Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T08:53:59.3471757Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T08:53:59.3471915Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T08:53:59.3472039Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T08:53:59.3472154Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T08:53:59.3472267Z * [new 
branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 2025-12-04T08:53:59.3472340Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T08:53:59.3472464Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T08:53:59.3472541Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T08:53:59.3472660Z * [new branch] nofun-hack -> origin/nofun-hack 2025-12-04T08:53:59.3472722Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T08:53:59.3472797Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T08:53:59.3472873Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T08:53:59.3472941Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T08:53:59.3473010Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T08:53:59.3473083Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T08:53:59.3473150Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T08:53:59.3473216Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T08:53:59.3473284Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T08:53:59.3473351Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T08:53:59.3473417Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T08:53:59.3473485Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T08:53:59.3473549Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T08:53:59.3473614Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T08:53:59.3473682Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T08:53:59.3473747Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T08:53:59.3473816Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 
2025-12-04T08:53:59.3473882Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T08:53:59.3473949Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T08:53:59.3474016Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T08:53:59.3474082Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T08:53:59.3474148Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T08:53:59.3474237Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T08:53:59.3474363Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T08:53:59.3474444Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T08:53:59.3474516Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T08:53:59.3474583Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T08:53:59.3474650Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T08:53:59.3474717Z * [new branch] oulgen-patch-4 -> origin/oulgen-patch-4 2025-12-04T08:53:59.3474784Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T08:53:59.3474846Z * [new branch] pca2 -> origin/pca2 2025-12-04T08:53:59.3474922Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T08:53:59.3474984Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T08:53:59.3475049Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T08:53:59.3475122Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T08:53:59.3475209Z * [new branch] pianpwk/__draft_debug_mode -> origin/pianpwk/__draft_debug_mode 2025-12-04T08:53:59.3475349Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T08:53:59.3475455Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T08:53:59.3475540Z * [new branch] pianpwk/_draft_triton_11_3 -> 
origin/pianpwk/_draft_triton_11_3 2025-12-04T08:53:59.3475634Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T08:53:59.3475736Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T08:53:59.3475834Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T08:53:59.3475940Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T08:53:59.3476014Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T08:53:59.3476096Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T08:53:59.3476211Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T08:53:59.3476297Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T08:53:59.3476392Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T08:53:59.3476478Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T08:53:59.3476569Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T08:53:59.3476657Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T08:53:59.3476739Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T08:53:59.3476847Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T08:53:59.3476937Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T08:53:59.3477022Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T08:53:59.3477118Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T08:53:59.3477221Z * [new branch] 
pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T08:53:59.3477346Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T08:53:59.3477470Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T08:53:59.3477578Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T08:53:59.3477672Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T08:53:59.3477786Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T08:53:59.3477879Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T08:53:59.3477981Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T08:53:59.3478062Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T08:53:59.3478144Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T08:53:59.3478221Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-12-04T08:53:59.3478356Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T08:53:59.3478466Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T08:53:59.3478581Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-12-04T08:53:59.3478667Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T08:53:59.3478774Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T08:53:59.3478880Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T08:53:59.3478965Z * [new 
branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T08:53:59.3479043Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T08:53:59.3479156Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T08:53:59.3479257Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T08:53:59.3479341Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T08:53:59.3479419Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T08:53:59.3479512Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T08:53:59.3479608Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T08:53:59.3479683Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T08:53:59.3479763Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T08:53:59.3479857Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T08:53:59.3479934Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T08:53:59.3480001Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T08:53:59.3480063Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T08:53:59.3480125Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T08:53:59.3480196Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T08:53:59.3480301Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T08:53:59.3480370Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T08:53:59.3480450Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T08:53:59.3480580Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T08:53:59.3480721Z * [new 
branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T08:53:59.3480799Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-12-04T08:53:59.3480872Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T08:53:59.3480972Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T08:53:59.3481037Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T08:53:59.3481100Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T08:53:59.3481164Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T08:53:59.3481225Z * [new branch] release/1.13 -> origin/release/1.13 2025-12-04T08:53:59.3481316Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T08:53:59.3481383Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T08:53:59.3481443Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T08:53:59.3481505Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T08:53:59.3481565Z * [new branch] release/1.7 -> origin/release/1.7 2025-12-04T08:53:59.3481626Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T08:53:59.3481691Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T08:53:59.3481751Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T08:53:59.3481811Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T08:53:59.3481910Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T08:53:59.3481972Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T08:53:59.3482032Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T08:53:59.3482094Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T08:53:59.3482154Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T08:53:59.3482213Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T08:53:59.3482276Z * [new branch] release/2.8 -> origin/release/2.8 
2025-12-04T08:53:59.3482337Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T08:53:59.3482401Z * [new branch] release_notes -> origin/release_notes 2025-12-04T08:53:59.3482478Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T08:53:59.3482605Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T08:53:59.3482725Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T08:53:59.3482844Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T08:53:59.3482961Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T08:53:59.3483091Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T08:53:59.3483250Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T08:53:59.3483353Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T08:53:59.3483459Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T08:53:59.3483634Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T08:53:59.3483731Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T08:53:59.3483831Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T08:53:59.3483898Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 2025-12-04T08:53:59.3483996Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T08:53:59.3484080Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T08:53:59.3484185Z * [new 
branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T08:53:59.3484322Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T08:53:59.3484427Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T08:53:59.3484509Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T08:53:59.3484657Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T08:53:59.3484744Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-12-04T08:53:59.3484824Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T08:53:59.3484888Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T08:53:59.3484950Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T08:53:59.3485017Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T08:53:59.3485085Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T08:53:59.3485250Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T08:53:59.3485343Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T08:53:59.3485461Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T08:53:59.3485522Z * [new branch] save -> origin/save 2025-12-04T08:53:59.3485583Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T08:53:59.3485650Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T08:53:59.3485711Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T08:53:59.3485818Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T08:53:59.3485897Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 
2025-12-04T08:53:59.3485973Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T08:53:59.3486048Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T08:53:59.3486130Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T08:53:59.3486235Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T08:53:59.3486318Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T08:53:59.3486390Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T08:53:59.3486450Z * [new branch] suo -> origin/suo 2025-12-04T08:53:59.3486514Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T08:53:59.3486574Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T08:53:59.3486666Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T08:53:59.3486735Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T08:53:59.3486803Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T08:53:59.3486873Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T08:53:59.3486937Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T08:53:59.3487001Z * [new branch] sy_dump_gm_code -> origin/sy_dump_gm_code 2025-12-04T08:53:59.3487061Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T08:53:59.3487165Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T08:53:59.3487232Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T08:53:59.3487298Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T08:53:59.3487361Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T08:53:59.3487426Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 2025-12-04T08:53:59.3487496Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T08:53:59.3487564Z * [new 
branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T08:53:59.3487626Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T08:53:59.3487708Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T08:53:59.3487788Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T08:53:59.3487868Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T08:53:59.3487930Z * [new branch] test-old -> origin/test-old 2025-12-04T08:53:59.3487995Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T08:53:59.3488091Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T08:53:59.3491444Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T08:53:59.3491543Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T08:53:59.3491669Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T08:53:59.3491807Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T08:53:59.3491937Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T08:53:59.3492031Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T08:53:59.3492131Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T08:53:59.3492232Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T08:53:59.3492411Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T08:53:59.3492491Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T08:53:59.3492574Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 
2025-12-04T08:53:59.3492640Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T08:53:59.3492716Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T08:53:59.3492774Z * [new branch] tmp -> origin/tmp 2025-12-04T08:53:59.3492839Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T08:53:59.3492919Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T08:53:59.3493003Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T08:53:59.3493095Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T08:53:59.3493167Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T08:53:59.3493230Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T08:53:59.3493337Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-12-04T08:53:59.3493400Z * [new branch] type_dec -> origin/type_dec 2025-12-04T08:53:59.3493491Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T08:53:59.3493628Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T08:53:59.3493761Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T08:53:59.3493894Z * [new branch] update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T08:53:59.3494024Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T08:53:59.3494154Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T08:53:59.3494283Z * [new branch] update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T08:53:59.3494421Z * [new branch] 
update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T08:53:59.3494554Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T08:53:59.3494687Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T08:53:59.3494825Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T08:53:59.3494955Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T08:53:59.3495090Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T08:53:59.3495219Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T08:53:59.3495303Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T08:53:59.3495429Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T08:53:59.3495586Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T08:53:59.3495705Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T08:53:59.3495830Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T08:53:59.3495909Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T08:53:59.3495996Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T08:53:59.3496083Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T08:53:59.3496167Z * [new branch] 
update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T08:53:59.3496254Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T08:53:59.3496348Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T08:53:59.3496425Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T08:53:59.3496517Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-12-04T08:53:59.3496648Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T08:53:59.3496710Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T08:53:59.3496771Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T08:53:59.3496828Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T08:53:59.3496883Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T08:53:59.3496944Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T08:53:59.3497004Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-12-04T08:53:59.3497059Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T08:53:59.3497123Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T08:53:59.3497192Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T08:53:59.3497259Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T08:53:59.3497326Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T08:53:59.3497400Z * [new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T08:53:59.3497476Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T08:53:59.3497540Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T08:53:59.3497656Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T08:53:59.3497720Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T08:53:59.3497783Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T08:53:59.3497870Z * [new branch] 
vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T08:53:59.3497938Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T08:53:59.3498003Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T08:53:59.3498063Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T08:53:59.3498130Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T08:53:59.3498193Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T08:53:59.3498258Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T08:53:59.3498348Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T08:53:59.3498417Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T08:53:59.3498478Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T08:53:59.3498557Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T08:53:59.3498618Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T08:53:59.3498679Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T08:53:59.3498748Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T08:53:59.3498897Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T08:53:59.3498965Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T08:53:59.3499034Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T08:53:59.3499097Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T08:53:59.3499159Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T08:53:59.3499254Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T08:53:59.3499321Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T08:53:59.3499395Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T08:53:59.3499470Z * [new branch] xmfan/ca_fix_polyfills -> 
origin/xmfan/ca_fix_polyfills 2025-12-04T08:53:59.3499531Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T08:53:59.3499594Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T08:53:59.3499661Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T08:53:59.3499726Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T08:53:59.3499794Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T08:53:59.3499886Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-12-04T08:53:59.3499955Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T08:53:59.3500020Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T08:53:59.3500085Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T08:53:59.3500167Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T08:53:59.3500264Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-12-04T08:53:59.3500418Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T08:53:59.3500562Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T08:53:59.3500633Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T08:53:59.3500698Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T08:53:59.3500758Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T08:53:59.3500845Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T08:53:59.3500922Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T08:53:59.3501015Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T08:53:59.3501113Z * [new branch] yiming/bootcamp -> 
origin/yiming/bootcamp 2025-12-04T08:53:59.3501215Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T08:53:59.3501278Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T08:53:59.3501349Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T08:53:59.3501436Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T08:53:59.3501514Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T08:53:59.3501575Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T08:53:59.3501648Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T08:53:59.3501712Z * [new branch] zb2p -> origin/zb2p 2025-12-04T08:53:59.3501794Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T08:53:59.3501913Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T08:53:59.3502014Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T08:53:59.3502135Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T08:53:59.3502257Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T08:53:59.3502355Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T08:53:59.3502443Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T08:53:59.3502531Z * [new branch] zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T08:53:59.3502660Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T08:53:59.3502759Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T08:53:59.3502847Z * [new branch] zhxchen17/precompile/aoti -> 
origin/zhxchen17/precompile/aoti 2025-12-04T08:53:59.3502943Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 2025-12-04T08:53:59.3503062Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T08:53:59.3503134Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T08:53:59.3503237Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T08:53:59.3503314Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-12-04T08:53:59.3503388Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T08:53:59.3503461Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T08:53:59.3503539Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T08:53:59.3503602Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T08:53:59.3503666Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T08:53:59.3503756Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T08:53:59.3503815Z * [new tag] ciflow/dynamo/169525 -> ciflow/dynamo/169525 2025-12-04T08:53:59.3503886Z t [tag update] ciflow/inductor/167647 -> ciflow/inductor/167647 2025-12-04T08:53:59.3503993Z t [tag update] ciflow/inductor/168266 -> ciflow/inductor/168266 2025-12-04T08:53:59.3504058Z t [tag update] ciflow/inductor/169535 -> ciflow/inductor/169535 2025-12-04T08:53:59.3504118Z * [new tag] ciflow/trunk/165728 -> ciflow/trunk/165728 2025-12-04T08:53:59.3504177Z * [new tag] ciflow/trunk/169048 -> ciflow/trunk/169048 2025-12-04T08:53:59.3504236Z * [new tag] ciflow/trunk/169125 -> ciflow/trunk/169125 2025-12-04T08:53:59.3504296Z * [new tag] ciflow/trunk/169555 -> ciflow/trunk/169555 2025-12-04T08:53:59.3504354Z * [new tag] ciflow/xpu/169555 -> ciflow/xpu/169555 2025-12-04T08:53:59.5382314Z [command]/usr/bin/git rev-parse --verify --quiet 
ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T08:53:59.5538365Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:53:59.5542428Z ##[endgroup] 2025-12-04T08:53:59.5542917Z ##[group]Determining the checkout info 2025-12-04T08:53:59.5544116Z ##[endgroup] 2025-12-04T08:53:59.5548890Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T08:53:59.5637317Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T08:53:59.5660883Z ##[group]Checking out the ref 2025-12-04T08:53:59.5662637Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:53:59.5924803Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T08:53:59.5931141Z ##[endgroup] 2025-12-04T08:53:59.5931368Z ##[group]Setting up auth for fetching submodules 2025-12-04T08:53:59.5935913Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T08:53:59.5960017Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T08:53:59.5975112Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T08:53:59.5989809Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T08:53:59.6003181Z ##[endgroup] 2025-12-04T08:53:59.6003346Z ##[group]Fetching submodules 2025-12-04T08:53:59.6004887Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T08:53:59.6210065Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T08:53:59.6223645Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T08:53:59.6235161Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T08:53:59.6247474Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T08:53:59.6259837Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T08:53:59.6278323Z Synchronizing submodule 
url for 'third_party/VulkanMemoryAllocator' 2025-12-04T08:53:59.6291062Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T08:53:59.6308963Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T08:53:59.6323332Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:53:59.6343997Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T08:53:59.6354847Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T08:53:59.6367237Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T08:53:59.6376730Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T08:53:59.6386325Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T08:53:59.6395806Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T08:53:59.6411718Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T08:53:59.6426299Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T08:53:59.6438751Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:53:59.6451975Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:53:59.6466336Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T08:53:59.6482057Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T08:53:59.6492394Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:53:59.6501354Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T08:53:59.6514226Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T08:53:59.6526518Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:53:59.6538085Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:53:59.6553487Z Synchronizing submodule url for 
'third_party/flatbuffers' 2025-12-04T08:53:59.6570898Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T08:53:59.6582114Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:53:59.6599690Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T08:53:59.6610779Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T08:53:59.6621371Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T08:53:59.6632818Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T08:53:59.6649114Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T08:53:59.6661041Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T08:53:59.6674152Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:53:59.6686105Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:53:59.6699015Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:53:59.6710406Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:53:59.6722818Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:53:59.6733870Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:53:59.6749540Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:53:59.6760450Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:53:59.6771058Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:53:59.6782023Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 
2025-12-04T08:53:59.6790892Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:53:59.6801632Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:59.6813244Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:59.6828381Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:53:59.6838574Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:53:59.6851790Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T08:53:59.6863165Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T08:53:59.6873962Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T08:53:59.6884876Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T08:53:59.6904452Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T08:53:59.6916613Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T08:53:59.6930963Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:53:59.6944017Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:53:59.6954524Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:53:59.6963824Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:53:59.6974255Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:53:59.6985210Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:53:59.6995827Z Synchronizing submodule url for 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:53:59.7007117Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:53:59.7018363Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:53:59.7029041Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:53:59.7049309Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T08:53:59.7059773Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T08:53:59.7081110Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:53:59.7092020Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T08:53:59.7105626Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T08:53:59.7116296Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T08:53:59.7127009Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T08:53:59.7137595Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T08:53:59.7148067Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T08:53:59.7158973Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T08:53:59.7170158Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:53:59.7181151Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:53:59.7190617Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:53:59.7201354Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:53:59.7211285Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:53:59.7235291Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 
2025-12-04T08:53:59.7475587Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T08:53:59.7548628Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T08:53:59.7604978Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T08:53:59.7723381Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T08:53:59.7789292Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T08:53:59.7843682Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T08:54:00.2921301Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T08:54:00.3105279Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T08:54:00.3297653Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T08:54:00.3421203Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T08:54:00.3642317Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T08:54:00.3732841Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T08:54:00.4376870Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T08:54:00.4470893Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T08:54:00.4611503Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T08:54:00.5339929Z Submodule path 'third_party/fbgemm': checked out 
'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T08:54:00.5641063Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T08:54:00.7384536Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T08:54:00.8029203Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T08:54:01.2339745Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T08:54:01.2555622Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:01.2647831Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T08:54:01.3183273Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T08:54:01.3297216Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T08:54:01.3489514Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T08:54:01.3600109Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T08:54:01.3703286Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T08:54:01.3850687Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T08:54:01.4050755Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T08:54:01.4188359Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T08:54:01.4392864Z Submodule path 
'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:01.4493771Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T08:54:01.8579704Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T08:54:01.8683179Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T08:54:01.8774697Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T08:54:01.8860500Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T08:54:01.8961078Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T08:54:01.9037723Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T08:54:01.9114495Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T08:54:01.9182154Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T08:54:01.9250652Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T08:54:01.9320915Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T08:54:01.9404133Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:01.9491106Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T08:54:01.9545823Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T08:54:01.9622175Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T08:54:01.9697327Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T08:54:01.9753362Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T08:54:01.9811264Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T08:54:01.9876476Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:01.9953575Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T08:54:02.0023598Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T08:54:02.0125174Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T08:54:02.1893103Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T08:54:02.2077205Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T08:54:02.2179851Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T08:54:02.2247782Z Submodule path 
'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T08:54:02.2324657Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T08:54:02.2398742Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T08:54:02.2491664Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T08:54:02.2557117Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T08:54:02.2608862Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T08:54:02.2675130Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T08:54:02.2758258Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T08:54:02.2835080Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T08:54:02.2993327Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T08:54:02.3049420Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T08:54:02.4390234Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T08:54:02.4490003Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T08:54:02.4716617Z Submodule path 
'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T08:54:02.4781922Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T08:54:02.4877158Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T08:54:02.5068530Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T08:54:02.5303009Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T08:54:02.5552144Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T08:54:02.5663102Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T08:54:02.5846834Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T08:54:02.5952259Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T08:54:02.6235271Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T08:54:02.6372808Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T08:54:02.6455368Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T08:54:02.6490829Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T08:54:02.6722042Z Entering 'android/libs/fbjni' 2025-12-04T08:54:02.6743881Z Entering 'third_party/FP16' 2025-12-04T08:54:02.6765920Z Entering 'third_party/FXdiv' 2025-12-04T08:54:02.6792744Z Entering 'third_party/NNPACK' 2025-12-04T08:54:02.6814645Z Entering 'third_party/NVTX' 
2025-12-04T08:54:02.6843622Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:02.6867849Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:02.6896222Z Entering 'third_party/aiter' 2025-12-04T08:54:02.6920712Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:02.6948625Z Entering 'third_party/benchmark' 2025-12-04T08:54:02.6976711Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:02.7005440Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:02.7027695Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:02.7053995Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:02.7082530Z Entering 'third_party/cutlass' 2025-12-04T08:54:02.7125317Z Entering 'third_party/fbgemm' 2025-12-04T08:54:02.7156497Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:02.7187796Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:02.7215489Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:02.7239294Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:02.7261509Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:02.7284369Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:02.7302231Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:02.7323914Z Entering 'third_party/flash-attention' 2025-12-04T08:54:02.7345890Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:02.7373249Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:02.7403566Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:02.7426535Z Entering 'third_party/fmt' 2025-12-04T08:54:02.7447839Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:02.7470400Z Entering 'third_party/gloo' 2025-12-04T08:54:02.7498704Z Entering 'third_party/googletest' 2025-12-04T08:54:02.7528505Z Entering 'third_party/ideep' 2025-12-04T08:54:02.7552524Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:02.7574877Z Entering 'third_party/ittapi' 
2025-12-04T08:54:02.7598214Z Entering 'third_party/kineto' 2025-12-04T08:54:02.7623233Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:02.7646833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:02.7673501Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:02.7706147Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:02.7728212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:02.7753825Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:02.7775212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:02.7800407Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:02.7832914Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:02.7863950Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:02.7895621Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:02.7920800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:02.7951067Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:02.7980782Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:02.8002361Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:02.8025844Z Entering 'third_party/kleidiai' 2025-12-04T08:54:02.8049523Z Entering 'third_party/mimalloc' 2025-12-04T08:54:02.8078601Z Entering 'third_party/nlohmann' 2025-12-04T08:54:02.8107792Z Entering 'third_party/onnx' 2025-12-04T08:54:02.8137582Z Entering 'third_party/onnx/third_party/pybind11' 
2025-12-04T08:54:02.8169728Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:02.8195690Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:02.8220820Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:02.8246100Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:02.8268102Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:02.8292063Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:02.8314194Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:02.8339035Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:02.8367382Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:02.8387306Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:02.8410905Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:02.8445081Z Entering 'third_party/pocketfft' 2025-12-04T08:54:02.8469085Z Entering 'third_party/protobuf' 2025-12-04T08:54:02.8493854Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:02.8515001Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:02.8546311Z Entering 'third_party/psimd' 2025-12-04T08:54:02.8575166Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:02.8597944Z Entering 'third_party/pybind11' 2025-12-04T08:54:02.8620764Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:02.8647972Z Entering 'third_party/sleef' 2025-12-04T08:54:02.8673761Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:02.8700700Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:02.8722725Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:02.8744737Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:02.8764151Z Entering 
'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:02.8784084Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:02.8828668Z ##[endgroup] 2025-12-04T08:54:02.8828891Z ##[group]Persisting credentials for submodules 2025-12-04T08:54:02.8837450Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T08:54:02.9026292Z Entering 'android/libs/fbjni' 2025-12-04T08:54:02.9051676Z Entering 'third_party/FP16' 2025-12-04T08:54:02.9077333Z Entering 'third_party/FXdiv' 2025-12-04T08:54:02.9103848Z Entering 'third_party/NNPACK' 2025-12-04T08:54:02.9131569Z Entering 'third_party/NVTX' 2025-12-04T08:54:02.9162330Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:02.9192936Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:02.9229690Z Entering 'third_party/aiter' 2025-12-04T08:54:02.9263198Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:02.9291815Z Entering 'third_party/benchmark' 2025-12-04T08:54:02.9317977Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:02.9343454Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:02.9363885Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:02.9388759Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:02.9409002Z Entering 'third_party/cutlass' 2025-12-04T08:54:02.9438420Z Entering 'third_party/fbgemm' 2025-12-04T08:54:02.9464423Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:02.9486137Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:02.9515240Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:02.9537595Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:02.9562472Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:02.9582803Z Entering 'third_party/fbgemm/external/hipify_torch' 
2025-12-04T08:54:02.9602090Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:02.9628204Z Entering 'third_party/flash-attention' 2025-12-04T08:54:02.9656833Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:02.9689726Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:02.9718795Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:02.9742628Z Entering 'third_party/fmt' 2025-12-04T08:54:02.9764958Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:02.9791417Z Entering 'third_party/gloo' 2025-12-04T08:54:02.9815852Z Entering 'third_party/googletest' 2025-12-04T08:54:02.9840956Z Entering 'third_party/ideep' 2025-12-04T08:54:02.9865678Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:02.9891148Z Entering 'third_party/ittapi' 2025-12-04T08:54:02.9915696Z Entering 'third_party/kineto' 2025-12-04T08:54:02.9938427Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:02.9961008Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:02.9983668Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:03.0005404Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:03.0026089Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:03.0051155Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:03.0087518Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:03.0110541Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:03.0138053Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:03.0157728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:03.0179996Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:03.0201800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.0235194Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.0262179Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:03.0285678Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:03.0312410Z Entering 'third_party/kleidiai' 2025-12-04T08:54:03.0336907Z Entering 'third_party/mimalloc' 2025-12-04T08:54:03.0357214Z Entering 'third_party/nlohmann' 2025-12-04T08:54:03.0380130Z Entering 'third_party/onnx' 2025-12-04T08:54:03.0407534Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:03.0432624Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:03.0455337Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:03.0483747Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:03.0516707Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:03.0538196Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:03.0564820Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:03.0586458Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:03.0611925Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:03.0639053Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.0668428Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.0694636Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:03.0734416Z Entering 'third_party/pocketfft' 2025-12-04T08:54:03.0757554Z Entering 
'third_party/protobuf' 2025-12-04T08:54:03.0781581Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:03.0806256Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:03.0830593Z Entering 'third_party/psimd' 2025-12-04T08:54:03.0853076Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:03.0878347Z Entering 'third_party/pybind11' 2025-12-04T08:54:03.0902222Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:03.0929442Z Entering 'third_party/sleef' 2025-12-04T08:54:03.0958679Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:03.0982995Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:03.1005878Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:03.1028135Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:03.1051066Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:03.1071715Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:03.1109842Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T08:54:03.1280136Z Entering 'android/libs/fbjni' 2025-12-04T08:54:03.1301718Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T08:54:03.1318634Z Entering 'third_party/FP16' 2025-12-04T08:54:03.1345336Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T08:54:03.1361907Z Entering 'third_party/FXdiv' 2025-12-04T08:54:03.1382036Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T08:54:03.1395662Z Entering 'third_party/NNPACK' 2025-12-04T08:54:03.1420140Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config 
remote.origin.url 2025-12-04T08:54:03.1430315Z Entering 'third_party/NVTX' 2025-12-04T08:54:03.1450687Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T08:54:03.1461220Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:03.1483440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T08:54:03.1494487Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:03.1519877Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T08:54:03.1535516Z Entering 'third_party/aiter' 2025-12-04T08:54:03.1556890Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T08:54:03.1567789Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:03.1587978Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T08:54:03.1605977Z Entering 'third_party/benchmark' 2025-12-04T08:54:03.1626878Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:03.1639974Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:03.1660115Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T08:54:03.1674112Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:03.1694916Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T08:54:03.1706182Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:03.1731909Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T08:54:03.1745376Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:03.1772323Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config 
remote.origin.url 2025-12-04T08:54:03.1787210Z Entering 'third_party/cutlass' 2025-12-04T08:54:03.1817350Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T08:54:03.1833240Z Entering 'third_party/fbgemm' 2025-12-04T08:54:03.1855479Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T08:54:03.1865203Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:03.1884621Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T08:54:03.1895211Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:03.1919721Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T08:54:03.1933773Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:03.1953061Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T08:54:03.1963151Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:03.1984593Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T08:54:03.1998119Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:03.2025602Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T08:54:03.2036120Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:03.2055365Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T08:54:03.2064481Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:03.2086838Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config 
remote.origin.url 2025-12-04T08:54:03.2099050Z Entering 'third_party/flash-attention' 2025-12-04T08:54:03.2120006Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T08:54:03.2128891Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:03.2148453Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T08:54:03.2160738Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:03.2187566Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T08:54:03.2202473Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:03.2221411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T08:54:03.2232773Z Entering 'third_party/fmt' 2025-12-04T08:54:03.2253521Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:03.2266593Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:03.2286728Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T08:54:03.2300522Z Entering 'third_party/gloo' 2025-12-04T08:54:03.2321837Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T08:54:03.2332844Z Entering 'third_party/googletest' 2025-12-04T08:54:03.2351526Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.2362609Z Entering 'third_party/ideep' 2025-12-04T08:54:03.2383274Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T08:54:03.2392620Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:03.2412608Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T08:54:03.2425739Z Entering 'third_party/ittapi' 2025-12-04T08:54:03.2445755Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T08:54:03.2455875Z Entering 'third_party/kineto' 2025-12-04T08:54:03.2477656Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T08:54:03.2487871Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:03.2508982Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T08:54:03.2518527Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:03.2539788Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T08:54:03.2554248Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:03.2572802Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T08:54:03.2581505Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:03.2601168Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:03.2610087Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:03.2630216Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T08:54:03.2642264Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:03.2670656Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T08:54:03.2682213Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:03.2705629Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T08:54:03.2714983Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:03.2735066Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.2744761Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:03.2767293Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T08:54:03.2776243Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:03.2796769Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T08:54:03.2806479Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:03.2825738Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:03.2834888Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.2860095Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:03.2870676Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.2902201Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:03.2916809Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:03.2937119Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T08:54:03.2946758Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:03.2970621Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.2981454Z Entering 'third_party/kleidiai' 2025-12-04T08:54:03.3007922Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T08:54:03.3024705Z Entering 'third_party/mimalloc' 2025-12-04T08:54:03.3046538Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T08:54:03.3058828Z Entering 'third_party/nlohmann' 2025-12-04T08:54:03.3080969Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T08:54:03.3092220Z Entering 'third_party/onnx' 2025-12-04T08:54:03.3112874Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T08:54:03.3128835Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:03.3148714Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:03.3160919Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:03.3181308Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T08:54:03.3191312Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:03.3211172Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:03.3220871Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:03.3240108Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.3249747Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:03.3268870Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T08:54:03.3278476Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:03.3298675Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T08:54:03.3310251Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:03.3329913Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T08:54:03.3339523Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:03.3359025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T08:54:03.3368459Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:03.3387353Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:03.3396913Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.3416686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:03.3434017Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.3453386Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:03.3465023Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:03.3491935Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T08:54:03.3508993Z Entering 'third_party/pocketfft' 2025-12-04T08:54:03.3529332Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T08:54:03.3539886Z Entering 'third_party/protobuf' 2025-12-04T08:54:03.3558581Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T08:54:03.3568978Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:03.3587327Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:03.3596717Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:03.3615962Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.3627070Z Entering 
'third_party/psimd' 2025-12-04T08:54:03.3654490Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T08:54:03.3666872Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:03.3686475Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T08:54:03.3695841Z Entering 'third_party/pybind11' 2025-12-04T08:54:03.3718857Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:03.3728277Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:03.3754796Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T08:54:03.3764578Z Entering 'third_party/sleef' 2025-12-04T08:54:03.3788610Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T08:54:03.3804089Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:03.3850155Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T08:54:03.3867310Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:03.3909521Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:03.3921842Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:03.3955821Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T08:54:03.3967396Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:03.4003308Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T08:54:03.4016982Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:03.4036159Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:03.4051983Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:03.4077047Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T08:54:03.4275839Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T08:54:03.4494986Z Entering 'android/libs/fbjni' 2025-12-04T08:54:03.4519461Z Entering 'third_party/FP16' 2025-12-04T08:54:03.4546007Z Entering 'third_party/FXdiv' 2025-12-04T08:54:03.4565911Z Entering 'third_party/NNPACK' 2025-12-04T08:54:03.4587175Z Entering 'third_party/NVTX' 2025-12-04T08:54:03.4607341Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:03.4627904Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:03.4654496Z Entering 'third_party/aiter' 2025-12-04T08:54:03.4678473Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:03.4704183Z Entering 'third_party/benchmark' 2025-12-04T08:54:03.4729577Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:03.4758332Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:03.4787735Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:03.4810890Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:03.4832209Z Entering 'third_party/cutlass' 2025-12-04T08:54:03.4858474Z Entering 'third_party/fbgemm' 2025-12-04T08:54:03.4882805Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:03.4902184Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:03.4926414Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:03.4952245Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:03.4974048Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:03.4994250Z 
Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:03.5014101Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:03.5039229Z Entering 'third_party/flash-attention' 2025-12-04T08:54:03.5060249Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:03.5083407Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:03.5108540Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:03.5129273Z Entering 'third_party/fmt' 2025-12-04T08:54:03.5149494Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:03.5170643Z Entering 'third_party/gloo' 2025-12-04T08:54:03.5198885Z Entering 'third_party/googletest' 2025-12-04T08:54:03.5219933Z Entering 'third_party/ideep' 2025-12-04T08:54:03.5240237Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:03.5265441Z Entering 'third_party/ittapi' 2025-12-04T08:54:03.5286156Z Entering 'third_party/kineto' 2025-12-04T08:54:03.5306272Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:03.5325432Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:03.5345365Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:03.5363969Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:03.5382166Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:03.5399546Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:03.5421584Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:03.5443147Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:03.5468672Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:03.5488642Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:03.5507978Z 
Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:03.5526657Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.5547764Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.5571322Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:03.5590776Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:03.5611940Z Entering 'third_party/kleidiai' 2025-12-04T08:54:03.5633379Z Entering 'third_party/mimalloc' 2025-12-04T08:54:03.5656789Z Entering 'third_party/nlohmann' 2025-12-04T08:54:03.5681109Z Entering 'third_party/onnx' 2025-12-04T08:54:03.5707563Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:03.5730505Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:03.5751134Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:03.5770554Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:03.5790318Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:03.5816037Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:03.5842045Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:03.5863766Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:03.5883653Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:03.5903534Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.5922222Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.5942153Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:03.5970413Z Entering 'third_party/pocketfft' 2025-12-04T08:54:03.5994909Z 
Entering 'third_party/protobuf' 2025-12-04T08:54:03.6017907Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:03.6037859Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:03.6063106Z Entering 'third_party/psimd' 2025-12-04T08:54:03.6086956Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:03.6110869Z Entering 'third_party/pybind11' 2025-12-04T08:54:03.6130723Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:03.6150797Z Entering 'third_party/sleef' 2025-12-04T08:54:03.6169646Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:03.6195153Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:03.6221978Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:03.6241638Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:03.6261920Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:03.6280037Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:03.6316168Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T08:54:03.6506203Z Entering 'android/libs/fbjni' 2025-12-04T08:54:03.6540645Z Entering 'third_party/FP16' 2025-12-04T08:54:03.6568176Z Entering 'third_party/FXdiv' 2025-12-04T08:54:03.6589415Z Entering 'third_party/NNPACK' 2025-12-04T08:54:03.6608943Z Entering 'third_party/NVTX' 2025-12-04T08:54:03.6628633Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:03.6652110Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:03.6678496Z Entering 'third_party/aiter' 2025-12-04T08:54:03.6708544Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:03.6742743Z Entering 'third_party/benchmark' 2025-12-04T08:54:03.6765405Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:03.6794550Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:03.6815204Z Entering 'third_party/cpuinfo' 
2025-12-04T08:54:03.6834660Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:03.6855171Z Entering 'third_party/cutlass' 2025-12-04T08:54:03.6879180Z Entering 'third_party/fbgemm' 2025-12-04T08:54:03.6901283Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:03.6919819Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:03.6942600Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:03.6962082Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:03.6985833Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:03.7005361Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:03.7024183Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:03.7050985Z Entering 'third_party/flash-attention' 2025-12-04T08:54:03.7077530Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:03.7098932Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:03.7124529Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:03.7147454Z Entering 'third_party/fmt' 2025-12-04T08:54:03.7167426Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:03.7188008Z Entering 'third_party/gloo' 2025-12-04T08:54:03.7209736Z Entering 'third_party/googletest' 2025-12-04T08:54:03.7229465Z Entering 'third_party/ideep' 2025-12-04T08:54:03.7249130Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:03.7272263Z Entering 'third_party/ittapi' 2025-12-04T08:54:03.7293248Z Entering 'third_party/kineto' 2025-12-04T08:54:03.7313027Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:03.7330390Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:03.7350986Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:03.7377389Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:03.7401609Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:03.7424269Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:03.7445735Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:03.7463460Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:03.7481164Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:03.7499849Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:03.7519920Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:03.7539335Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.7570131Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.7595562Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:03.7616356Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:03.7642703Z Entering 'third_party/kleidiai' 2025-12-04T08:54:03.7662618Z Entering 'third_party/mimalloc' 2025-12-04T08:54:03.7683488Z Entering 'third_party/nlohmann' 2025-12-04T08:54:03.7705402Z Entering 'third_party/onnx' 2025-12-04T08:54:03.7729170Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:03.7751663Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:03.7772465Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:03.7798352Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:03.7817081Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:03.7836251Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:03.7855571Z Entering 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:03.7875411Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:03.7895079Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:03.7913907Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:03.7938674Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:03.7962995Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:03.7990058Z Entering 'third_party/pocketfft' 2025-12-04T08:54:03.8015633Z Entering 'third_party/protobuf' 2025-12-04T08:54:03.8036674Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:03.8055184Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:03.8076823Z Entering 'third_party/psimd' 2025-12-04T08:54:03.8097788Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:03.8118012Z Entering 'third_party/pybind11' 2025-12-04T08:54:03.8138848Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:03.8171563Z Entering 'third_party/sleef' 2025-12-04T08:54:03.8194351Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:03.8216014Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:03.8234152Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:03.8256136Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:03.8279331Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:03.8298492Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:03.8332593Z ##[endgroup] 2025-12-04T08:54:03.8610971Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T08:54:03.8769963Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:03.8894067Z ##[group]Run actions/checkout@v4 2025-12-04T08:54:03.8894199Z with: 2025-12-04T08:54:03.8894314Z ref: 
ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:03.8894449Z fetch-depth: 0 2025-12-04T08:54:03.8894546Z submodules: recursive 2025-12-04T08:54:03.8894649Z show-progress: false 2025-12-04T08:54:03.8894776Z repository: pytorch/pytorch 2025-12-04T08:54:03.8894957Z token: *** 2025-12-04T08:54:03.8895049Z ssh-strict: true 2025-12-04T08:54:03.8895150Z ssh-user: git 2025-12-04T08:54:03.8895247Z persist-credentials: true 2025-12-04T08:54:03.8895361Z clean: true 2025-12-04T08:54:03.8895466Z sparse-checkout-cone-mode: true 2025-12-04T08:54:03.8895590Z fetch-tags: false 2025-12-04T08:54:03.8895689Z lfs: false 2025-12-04T08:54:03.8895780Z set-safe-directory: true 2025-12-04T08:54:03.8895885Z env: 2025-12-04T08:54:03.8895980Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:03.8896084Z ##[endgroup] 2025-12-04T08:54:03.9351928Z Syncing repository: pytorch/pytorch 2025-12-04T08:54:03.9352234Z ##[group]Getting Git version info 2025-12-04T08:54:03.9352449Z Working directory is '/home/runner/_work/pytorch/pytorch' 2025-12-04T08:54:03.9366749Z [command]/usr/bin/git version 2025-12-04T08:54:03.9395463Z git version 2.52.0 2025-12-04T08:54:03.9416091Z ##[endgroup] 2025-12-04T08:54:03.9422240Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/dcbc1bd2-dbda-4811-b815-f3a2f10e13fb/.gitconfig' 2025-12-04T08:54:03.9429156Z Temporarily overriding HOME='/home/runner/_work/_temp/dcbc1bd2-dbda-4811-b815-f3a2f10e13fb' before making global git config changes 2025-12-04T08:54:03.9429478Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T08:54:03.9432082Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T08:54:03.9459214Z [command]/usr/bin/git config --local --get remote.origin.url 2025-12-04T08:54:03.9480006Z https://github.com/pytorch/pytorch 2025-12-04T08:54:03.9498002Z ##[group]Removing previously created refs, to avoid conflicts 2025-12-04T08:54:03.9501650Z [command]/usr/bin/git 
rev-parse --symbolic-full-name --verify --quiet HEAD 2025-12-04T08:54:03.9516933Z HEAD 2025-12-04T08:54:03.9551475Z ##[endgroup] 2025-12-04T08:54:03.9553636Z [command]/usr/bin/git submodule status 2025-12-04T08:54:03.9778874Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-12-04T08:54:03.9825527Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081) 2025-12-04T08:54:03.9885627Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-12-04T08:54:03.9942583Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-12-04T08:54:03.9982134Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93) 2025-12-04T08:54:04.0033538Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-12-04T08:54:04.0346163Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-12-04T08:54:04.0372776Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-12-04T08:54:04.0388916Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-12-04T08:54:04.0449728Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-12-04T08:54:04.0525253Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-12-04T08:54:04.0605096Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30) 2025-12-04T08:54:04.0632677Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c) 2025-12-04T08:54:04.0699823Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1) 2025-12-04T08:54:04.0723983Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39) 2025-12-04T08:54:04.0777976Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-12-04T08:54:04.0792169Z 
a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-12-04T08:54:04.1028644Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0) 2025-12-04T08:54:04.1100324Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-12-04T08:54:04.1177230Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-12-04T08:54:04.1314133Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-12-04T08:54:04.1368007Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-12-04T08:54:04.1414922Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-12-04T08:54:04.1547160Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main) 2025-12-04T08:54:04.1562266Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0) 2025-12-04T08:54:04.1586617Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-12-04T08:54:04.1610983Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-12-04T08:54:04.1813352Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-12-04T08:54:04.1831406Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T08:54:04.1856184Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T08:54:04.2058584Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T08:54:04.2105831Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T08:54:04.2144755Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T08:54:04.2163648Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-12-04T08:54:04.2221694Z 
f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T08:54:04.2269519Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T08:54:04.2310011Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T08:54:04.2320680Z ##[group]Cleaning the repository 2025-12-04T08:54:04.2325620Z [command]/usr/bin/git clean -ffdx 2025-12-04T08:54:04.2440307Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T08:54:04.3232304Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T08:54:04.3283200Z ##[endgroup] 2025-12-04T08:54:04.3285662Z ##[group]Disabling automatic garbage collection 2025-12-04T08:54:04.3292300Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T08:54:04.3320899Z ##[endgroup] 2025-12-04T08:54:04.3321136Z ##[group]Setting up auth 2025-12-04T08:54:04.3326363Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T08:54:04.3360899Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T08:54:04.3586367Z Entering 'android/libs/fbjni' 2025-12-04T08:54:04.3608910Z Entering 'third_party/FP16' 2025-12-04T08:54:04.3647867Z Entering 'third_party/FXdiv' 2025-12-04T08:54:04.3676952Z Entering 'third_party/NNPACK' 2025-12-04T08:54:04.3702805Z Entering 'third_party/NVTX' 2025-12-04T08:54:04.3723256Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:04.3749004Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:04.3781065Z Entering 'third_party/aiter' 2025-12-04T08:54:04.3814384Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:04.3861623Z Entering 'third_party/benchmark' 2025-12-04T08:54:04.3885765Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:04.3918703Z Entering 'third_party/cpp-httplib' 
2025-12-04T08:54:04.3940942Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:04.3967704Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:04.3989424Z Entering 'third_party/cutlass' 2025-12-04T08:54:04.4013916Z Entering 'third_party/fbgemm' 2025-12-04T08:54:04.4048577Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:04.4070198Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:04.4103472Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:04.4128173Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:04.4154118Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:04.4175699Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:04.4197992Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:04.4220442Z Entering 'third_party/flash-attention' 2025-12-04T08:54:04.4244731Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:04.4276325Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:04.4303006Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:04.4328566Z Entering 'third_party/fmt' 2025-12-04T08:54:04.4351221Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:04.4375880Z Entering 'third_party/gloo' 2025-12-04T08:54:04.4403239Z Entering 'third_party/googletest' 2025-12-04T08:54:04.4428432Z Entering 'third_party/ideep' 2025-12-04T08:54:04.4451575Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:04.4481063Z Entering 'third_party/ittapi' 2025-12-04T08:54:04.4504731Z Entering 'third_party/kineto' 2025-12-04T08:54:04.4525108Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:04.4549573Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:04.4571521Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:04.4595926Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 
2025-12-04T08:54:04.4620217Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:04.4643446Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:04.4668051Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:04.4691043Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:04.4714818Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:04.4736625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:04.4755774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:04.4775524Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:04.4798858Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:04.4834371Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:04.4854307Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:04.4877563Z Entering 'third_party/kleidiai' 2025-12-04T08:54:04.4899596Z Entering 'third_party/mimalloc' 2025-12-04T08:54:04.4921074Z Entering 'third_party/nlohmann' 2025-12-04T08:54:04.4942580Z Entering 'third_party/onnx' 2025-12-04T08:54:04.4974077Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:04.4999844Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:04.5029019Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:04.5058632Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:04.5084129Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:04.5106425Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:04.5134976Z 
Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:04.5159607Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:04.5180746Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:04.5201155Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:04.5222127Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:04.5243304Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:04.5274210Z Entering 'third_party/pocketfft' 2025-12-04T08:54:04.5295558Z Entering 'third_party/protobuf' 2025-12-04T08:54:04.5319055Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:04.5352111Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:04.5375850Z Entering 'third_party/psimd' 2025-12-04T08:54:04.5396206Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:04.5415413Z Entering 'third_party/pybind11' 2025-12-04T08:54:04.5438735Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:04.5465678Z Entering 'third_party/sleef' 2025-12-04T08:54:04.5487191Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:04.5507565Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:04.5529428Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:04.5552947Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:04.5590560Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:04.5618153Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:04.5657537Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T08:54:04.5672920Z http.https://github.com/.extraheader 2025-12-04T08:54:04.5681641Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 
2025-12-04T08:54:04.5701678Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T08:54:04.5873330Z Entering 'android/libs/fbjni' 2025-12-04T08:54:04.5892153Z http.https://github.com/.extraheader 2025-12-04T08:54:04.5910634Z Entering 'third_party/FP16' 2025-12-04T08:54:04.5922949Z http.https://github.com/.extraheader 2025-12-04T08:54:04.5942853Z Entering 'third_party/FXdiv' 2025-12-04T08:54:04.5956949Z http.https://github.com/.extraheader 2025-12-04T08:54:04.5979017Z Entering 'third_party/NNPACK' 2025-12-04T08:54:04.5992079Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6009735Z Entering 'third_party/NVTX' 2025-12-04T08:54:04.6022628Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6040532Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:04.6056258Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6074823Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:04.6088528Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6110143Z Entering 'third_party/aiter' 2025-12-04T08:54:04.6125614Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6147570Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:04.6161349Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6190379Z Entering 'third_party/benchmark' 2025-12-04T08:54:04.6203712Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6219904Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:04.6233633Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6257648Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:04.6271369Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6287778Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:04.6301597Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6318770Z Entering 
'third_party/cudnn_frontend' 2025-12-04T08:54:04.6332066Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6353822Z Entering 'third_party/cutlass' 2025-12-04T08:54:04.6370461Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6390935Z Entering 'third_party/fbgemm' 2025-12-04T08:54:04.6404228Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6421647Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:04.6434991Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6453164Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:04.6468012Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6487677Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:04.6500803Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6517425Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:04.6531608Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6552106Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:04.6564724Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6581730Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:04.6593971Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6613593Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:04.6626060Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6645508Z Entering 'third_party/flash-attention' 2025-12-04T08:54:04.6658174Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6677382Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:04.6689164Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6716592Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:04.6729257Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6750915Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:04.6764149Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6788542Z Entering 'third_party/fmt' 
2025-12-04T08:54:04.6802170Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6817813Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:04.6830654Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6846336Z Entering 'third_party/gloo' 2025-12-04T08:54:04.6859348Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6875097Z Entering 'third_party/googletest' 2025-12-04T08:54:04.6888173Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6904287Z Entering 'third_party/ideep' 2025-12-04T08:54:04.6919206Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6933375Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:04.6946894Z http.https://github.com/.extraheader 2025-12-04T08:54:04.6973942Z Entering 'third_party/ittapi' 2025-12-04T08:54:04.6989893Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7007880Z Entering 'third_party/kineto' 2025-12-04T08:54:04.7025138Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7052018Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:04.7073934Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7098525Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:04.7114781Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7132393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:04.7143514Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7159946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:04.7171718Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7194238Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:04.7211590Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7230330Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:04.7255077Z http.https://github.com/.extraheader 
2025-12-04T08:54:04.7282168Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:04.7296026Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7313395Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:04.7328419Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7348761Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:04.7368841Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7387253Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:04.7401789Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7424698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:04.7440727Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7457854Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:04.7470651Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7495602Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:04.7508411Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7529844Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:04.7543688Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7560464Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:04.7579980Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7600861Z Entering 'third_party/kleidiai' 2025-12-04T08:54:04.7615588Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7633177Z Entering 'third_party/mimalloc' 2025-12-04T08:54:04.7646073Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7662677Z Entering 'third_party/nlohmann' 2025-12-04T08:54:04.7679571Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7696246Z 
Entering 'third_party/onnx' 2025-12-04T08:54:04.7711184Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7733232Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:04.7746413Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7769596Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:04.7782647Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7801490Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:04.7821466Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7841558Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:04.7855697Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7875531Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:04.7890186Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7907654Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:04.7918901Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7939827Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:04.7952059Z http.https://github.com/.extraheader 2025-12-04T08:54:04.7976274Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:04.7989907Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8007244Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:04.8020357Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8037784Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:04.8049605Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8066877Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:04.8078891Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8098047Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:04.8109180Z 
http.https://github.com/.extraheader 2025-12-04T08:54:04.8136126Z Entering 'third_party/pocketfft' 2025-12-04T08:54:04.8149570Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8171953Z Entering 'third_party/protobuf' 2025-12-04T08:54:04.8185472Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8204010Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:04.8216554Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8232885Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:04.8246317Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8265768Z Entering 'third_party/psimd' 2025-12-04T08:54:04.8279047Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8296985Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:04.8310523Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8327015Z Entering 'third_party/pybind11' 2025-12-04T08:54:04.8340053Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8360716Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:04.8374030Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8391149Z Entering 'third_party/sleef' 2025-12-04T08:54:04.8404957Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8422179Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:04.8434564Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8456893Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:04.8481344Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8499425Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:04.8511789Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8527418Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:04.8545040Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8575242Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:04.8591207Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8613705Z Entering 
'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:04.8640808Z http.https://github.com/.extraheader 2025-12-04T08:54:04.8689931Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:04.8719776Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T08:54:04.8909320Z Entering 'android/libs/fbjni' 2025-12-04T08:54:04.8921053Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T08:54:04.8932934Z Entering 'third_party/FP16' 2025-12-04T08:54:04.8943915Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T08:54:04.8957303Z Entering 'third_party/FXdiv' 2025-12-04T08:54:04.8967561Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T08:54:04.8976945Z Entering 'third_party/NNPACK' 2025-12-04T08:54:04.8986663Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T08:54:04.8995609Z Entering 'third_party/NVTX' 2025-12-04T08:54:04.9008774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T08:54:04.9018432Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:04.9029562Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T08:54:04.9041492Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:04.9052061Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T08:54:04.9072954Z Entering 'third_party/aiter' 2025-12-04T08:54:04.9083176Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T08:54:04.9093847Z Entering 
'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:04.9107390Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T08:54:04.9121061Z Entering 'third_party/benchmark' 2025-12-04T08:54:04.9138716Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:04.9147950Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:04.9159556Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T08:54:04.9170495Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:04.9179809Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T08:54:04.9188489Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:04.9197744Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T08:54:04.9206769Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:04.9216700Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T08:54:04.9229292Z Entering 'third_party/cutlass' 2025-12-04T08:54:04.9242390Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T08:54:04.9262004Z Entering 'third_party/fbgemm' 2025-12-04T08:54:04.9276423Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T08:54:04.9292048Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:04.9302086Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T08:54:04.9310355Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:04.9320078Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T08:54:04.9333727Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:04.9343491Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T08:54:04.9353350Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:04.9372592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T08:54:04.9385316Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:04.9395513Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T08:54:04.9404336Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:04.9413700Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T08:54:04.9421833Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:04.9431010Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T08:54:04.9442351Z Entering 'third_party/flash-attention' 2025-12-04T08:54:04.9452345Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T08:54:04.9462801Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:04.9472935Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T08:54:04.9484448Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:04.9493692Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 
2025-12-04T08:54:04.9506797Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:04.9516969Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T08:54:04.9526326Z Entering 'third_party/fmt' 2025-12-04T08:54:04.9535846Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:04.9551179Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:04.9563886Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T08:54:04.9573077Z Entering 'third_party/gloo' 2025-12-04T08:54:04.9584683Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T08:54:04.9593368Z Entering 'third_party/googletest' 2025-12-04T08:54:04.9603116Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:04.9615040Z Entering 'third_party/ideep' 2025-12-04T08:54:04.9626976Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T08:54:04.9637666Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:04.9648187Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T08:54:04.9660835Z Entering 'third_party/ittapi' 2025-12-04T08:54:04.9671082Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T08:54:04.9681106Z Entering 'third_party/kineto' 2025-12-04T08:54:04.9693231Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T08:54:04.9702326Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:04.9712756Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T08:54:04.9722853Z 
Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:04.9734006Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T08:54:04.9744769Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:04.9756000Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T08:54:04.9768816Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:04.9791271Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:04.9799422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:04.9812463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T08:54:04.9820236Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:04.9832297Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T08:54:04.9843648Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:04.9855056Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T08:54:04.9867137Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:04.9877479Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:04.9887011Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:04.9896990Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T08:54:04.9911462Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:04.9922186Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T08:54:04.9930272Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:04.9946297Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:04.9955869Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:04.9965656Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:04.9977471Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:04.9990928Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:05.0003822Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:05.0013344Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config 
remote.origin.url 2025-12-04T08:54:05.0023384Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:05.0034262Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T08:54:05.0044891Z Entering 'third_party/kleidiai' 2025-12-04T08:54:05.0054721Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T08:54:05.0063554Z Entering 'third_party/mimalloc' 2025-12-04T08:54:05.0074136Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T08:54:05.0083176Z Entering 'third_party/nlohmann' 2025-12-04T08:54:05.0093756Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T08:54:05.0103619Z Entering 'third_party/onnx' 2025-12-04T08:54:05.0113992Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T08:54:05.0129249Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:05.0139131Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:05.0151674Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:05.0162690Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T08:54:05.0172555Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:05.0182146Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:05.0191512Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:05.0200828Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 
2025-12-04T08:54:05.0209404Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:05.0218309Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T08:54:05.0226679Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:05.0235343Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T08:54:05.0244053Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:05.0252957Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T08:54:05.0261276Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:05.0271310Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T08:54:05.0282913Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:05.0293982Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:05.0302553Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:05.0312501Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:05.0321779Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:05.0332448Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:05.0343449Z Entering 
'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:05.0355783Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T08:54:05.0372755Z Entering 'third_party/pocketfft' 2025-12-04T08:54:05.0382645Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T08:54:05.0391968Z Entering 'third_party/protobuf' 2025-12-04T08:54:05.0402731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T08:54:05.0413233Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:05.0422225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:05.0431665Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:05.0442932Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:05.0456552Z Entering 'third_party/psimd' 2025-12-04T08:54:05.0466737Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T08:54:05.0477962Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:05.0488431Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T08:54:05.0497792Z Entering 'third_party/pybind11' 2025-12-04T08:54:05.0507314Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:05.0516133Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:05.0525882Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T08:54:05.0536001Z Entering 'third_party/sleef' 2025-12-04T08:54:05.0546094Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T08:54:05.0555884Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:05.0567439Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T08:54:05.0577117Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:05.0585978Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:05.0595072Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:05.0604384Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T08:54:05.0612901Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:05.0627851Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T08:54:05.0636507Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:05.0645666Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:05.0653768Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:05.0671321Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T08:54:05.0700359Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0719720Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0733992Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0753347Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0767209Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0780550Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0793387Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0809884Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0826541Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0840512Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0859048Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0872815Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0885238Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0905472Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0927614Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0941267Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0961033Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0974595Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.0987467Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1003178Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1020871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1034289Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1051134Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1065086Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1080057Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1094496Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1110107Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1124316Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1137874Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1154750Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1168705Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1183133Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1197215Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1211643Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1229726Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1244319Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1263309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1278642Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1293538Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1307582Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1320281Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T08:54:05.1334588Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1348625Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1362709Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1378933Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1392455Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1406640Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1421003Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1436041Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp 
^includeIf\.gitdir: 2025-12-04T08:54:05.1455651Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1469495Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1483691Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1499549Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1512655Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1527024Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1541025Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1555386Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1573621Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1595437Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1610550Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1624891Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1640318Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1654820Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1668367Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1681189Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1696205Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1711246Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T08:54:05.1725566Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1740568Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1759133Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1773323Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1787531Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1801619Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1819309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1833266Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1851903Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1866049Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T08:54:05.1896379Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1923691Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1943601Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1962024Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T08:54:05.1980937Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T08:54:05.2003649Z ##[endgroup] 2025-12-04T08:54:05.2003882Z ##[group]Fetching the repository 2025-12-04T08:54:05.2007397Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T08:54:06.5443504Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T08:54:06.5544813Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:06.5550549Z ##[endgroup] 2025-12-04T08:54:06.5550991Z ##[group]Determining the checkout info 2025-12-04T08:54:06.5551811Z ##[endgroup] 2025-12-04T08:54:06.5557781Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T08:54:06.5652262Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T08:54:06.5675655Z ##[group]Checking out the ref 2025-12-04T08:54:06.5677420Z [command]/usr/bin/git checkout --progress --force 
ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:06.5946337Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T08:54:06.5952441Z ##[endgroup] 2025-12-04T08:54:06.5952891Z ##[group]Setting up auth for fetching submodules 2025-12-04T08:54:06.5955754Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T08:54:06.5984920Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T08:54:06.6002703Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T08:54:06.6019288Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T08:54:06.6034074Z ##[endgroup] 2025-12-04T08:54:06.6034322Z ##[group]Fetching submodules 2025-12-04T08:54:06.6035614Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T08:54:06.6236844Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T08:54:06.6247981Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T08:54:06.6259302Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T08:54:06.6269127Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T08:54:06.6280258Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T08:54:06.6293397Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:06.6303178Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T08:54:06.6321945Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T08:54:06.6336302Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:06.6350221Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T08:54:06.6360621Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T08:54:06.6374002Z Synchronizing submodule url for 'third_party/cpp-httplib' 
2025-12-04T08:54:06.6384876Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T08:54:06.6396255Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T08:54:06.6409709Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T08:54:06.6430799Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T08:54:06.6446420Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:06.6463448Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:06.6476268Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:06.6489504Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:06.6506990Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:06.6518590Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:06.6527986Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T08:54:06.6539642Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T08:54:06.6553107Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:06.6564468Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:06.6582214Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T08:54:06.6594057Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T08:54:06.6608360Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:06.6621976Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T08:54:06.6632912Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T08:54:06.6643918Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T08:54:06.6663969Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:06.6679620Z Synchronizing 
submodule url for 'third_party/ittapi' 2025-12-04T08:54:06.6689994Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T08:54:06.6699708Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:06.6712730Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:06.6724361Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:06.6735517Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:06.6747539Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:06.6764573Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:06.6776919Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:06.6787056Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:06.6799562Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:06.6810210Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:06.6821438Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:06.6833929Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:06.6848854Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:06.6864888Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:06.6874058Z 
Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:06.6885169Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T08:54:06.6896185Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T08:54:06.6906832Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T08:54:06.6917683Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T08:54:06.6944853Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:06.6958628Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T08:54:06.6973259Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:06.6983908Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:06.7000182Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:06.7010313Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:06.7021391Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:06.7030505Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:06.7040529Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:06.7051584Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:06.7062918Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:06.7076955Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:06.7097187Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T08:54:06.7107295Z Synchronizing submodule url for 'third_party/protobuf' 
2025-12-04T08:54:06.7127914Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:06.7138743Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:06.7151685Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T08:54:06.7164783Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T08:54:06.7175402Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T08:54:06.7184892Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T08:54:06.7193875Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T08:54:06.7203258Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T08:54:06.7213056Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:06.7223001Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:06.7233203Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:06.7243447Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:06.7260892Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:06.7286816Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T08:54:06.7518696Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T08:54:06.7565081Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T08:54:06.7606961Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T08:54:06.7669612Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T08:54:06.7750275Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 
2025-12-04T08:54:06.7806444Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T08:54:06.7949693Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T08:54:06.8086269Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T08:54:06.8254323Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T08:54:06.8321936Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T08:54:06.8524405Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T08:54:06.8603280Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T08:54:06.8677119Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T08:54:06.8747276Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T08:54:06.8866847Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T08:54:06.8999762Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T08:54:06.9055718Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T08:54:06.9233645Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T08:54:06.9299076Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T08:54:06.9410639Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T08:54:06.9476088Z 
Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:06.9530283Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T08:54:06.9620817Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T08:54:06.9714175Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T08:54:06.9891672Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T08:54:07.0010424Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T08:54:07.0108919Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T08:54:07.0176978Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T08:54:07.0238097Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T08:54:07.0298434Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T08:54:07.0362132Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:07.0419246Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T08:54:07.0592301Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T08:54:07.0649009Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T08:54:07.0718153Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T08:54:07.0793854Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T08:54:07.0872479Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T08:54:07.0926807Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T08:54:07.0985679Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T08:54:07.1050920Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T08:54:07.1120690Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T08:54:07.1178653Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T08:54:07.1242221Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:07.1343502Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T08:54:07.1397541Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T08:54:07.1460232Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T08:54:07.1572602Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 
'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T08:54:07.1656464Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T08:54:07.1719452Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T08:54:07.1771947Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T08:54:07.1847997Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T08:54:07.1914971Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T08:54:07.1999901Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T08:54:07.2147130Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T08:54:07.2231697Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T08:54:07.2324684Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T08:54:07.2395083Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T08:54:07.2448725Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T08:54:07.2500375Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T08:54:07.2597249Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T08:54:07.2650171Z Submodule path 
'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T08:54:07.2698949Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T08:54:07.2755688Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T08:54:07.2834405Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T08:54:07.2897347Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T08:54:07.3047346Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T08:54:07.3102565Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T08:54:07.3287437Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T08:54:07.3356570Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T08:54:07.3417309Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T08:54:07.3488493Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T08:54:07.3538219Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T08:54:07.3609663Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T08:54:07.3670703Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 
2025-12-04T08:54:07.3724852Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T08:54:07.3779970Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T08:54:07.3844250Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T08:54:07.3898999Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T08:54:07.4039407Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T08:54:07.4102448Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T08:54:07.4154100Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T08:54:07.4185510Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T08:54:07.4352012Z Entering 'android/libs/fbjni' 2025-12-04T08:54:07.4376838Z Entering 'third_party/FP16' 2025-12-04T08:54:07.4402873Z Entering 'third_party/FXdiv' 2025-12-04T08:54:07.4424018Z Entering 'third_party/NNPACK' 2025-12-04T08:54:07.4451983Z Entering 'third_party/NVTX' 2025-12-04T08:54:07.4472775Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:07.4493453Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:07.4524628Z Entering 'third_party/aiter' 2025-12-04T08:54:07.4549469Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:07.4582587Z Entering 'third_party/benchmark' 2025-12-04T08:54:07.4604842Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:07.4635916Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:07.4658821Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:07.4681113Z Entering 'third_party/cudnn_frontend' 
2025-12-04T08:54:07.4707327Z Entering 'third_party/cutlass' 2025-12-04T08:54:07.4735058Z Entering 'third_party/fbgemm' 2025-12-04T08:54:07.4757331Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:07.4777314Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:07.4802193Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:07.4822914Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:07.4845290Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:07.4865665Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:07.4891341Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:07.4916666Z Entering 'third_party/flash-attention' 2025-12-04T08:54:07.4937866Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:07.4970297Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:07.5001274Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:07.5024236Z Entering 'third_party/fmt' 2025-12-04T08:54:07.5046560Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:07.5069101Z Entering 'third_party/gloo' 2025-12-04T08:54:07.5090975Z Entering 'third_party/googletest' 2025-12-04T08:54:07.5112434Z Entering 'third_party/ideep' 2025-12-04T08:54:07.5132793Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:07.5165364Z Entering 'third_party/ittapi' 2025-12-04T08:54:07.5188820Z Entering 'third_party/kineto' 2025-12-04T08:54:07.5213197Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:07.5231188Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:07.5256233Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:07.5276170Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:07.5298231Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:07.5319381Z 
Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:07.5340637Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:07.5368331Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:07.5399658Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:07.5425152Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:07.5447946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:07.5477214Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:07.5498336Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:07.5525894Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:07.5553200Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:07.5581680Z Entering 'third_party/kleidiai' 2025-12-04T08:54:07.5604814Z Entering 'third_party/mimalloc' 2025-12-04T08:54:07.5629403Z Entering 'third_party/nlohmann' 2025-12-04T08:54:07.5653620Z Entering 'third_party/onnx' 2025-12-04T08:54:07.5696413Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:07.5735168Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:07.5771425Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:07.5794795Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:07.5819338Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:07.5840535Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:07.5865016Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:07.5888016Z Entering 
'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:07.5909638Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:07.5933616Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:07.5958230Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:07.5988871Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:07.6022272Z Entering 'third_party/pocketfft' 2025-12-04T08:54:07.6045347Z Entering 'third_party/protobuf' 2025-12-04T08:54:07.6067883Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:07.6087389Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:07.6109258Z Entering 'third_party/psimd' 2025-12-04T08:54:07.6130155Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:07.6149504Z Entering 'third_party/pybind11' 2025-12-04T08:54:07.6168295Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:07.6187065Z Entering 'third_party/sleef' 2025-12-04T08:54:07.6205263Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:07.6224499Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:07.6243847Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:07.6264174Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:07.6284520Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:07.6304376Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:07.6336453Z ##[endgroup] 2025-12-04T08:54:07.6336639Z ##[group]Persisting credentials for submodules 2025-12-04T08:54:07.6341512Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T08:54:07.6498954Z Entering 'android/libs/fbjni' 
2025-12-04T08:54:07.6512540Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6513602Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6528891Z Entering 'third_party/FP16' 2025-12-04T08:54:07.6541218Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6541356Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6560826Z Entering 'third_party/FXdiv' 2025-12-04T08:54:07.6572610Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6572741Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6589390Z Entering 'third_party/NNPACK' 2025-12-04T08:54:07.6601467Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6601594Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6615418Z Entering 'third_party/NVTX' 2025-12-04T08:54:07.6630142Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6630269Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6648152Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:07.6663987Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6664116Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6679844Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:07.6691911Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6692045Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6715028Z Entering 'third_party/aiter' 2025-12-04T08:54:07.6728266Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6728394Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6744229Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:07.6757843Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6757975Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6779563Z Entering 'third_party/benchmark' 2025-12-04T08:54:07.6797164Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6797300Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6816127Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:07.6829240Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6852009Z 
url.https://github.com/.insteadof 2025-12-04T08:54:07.6852135Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:07.6864989Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6865118Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6882146Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:07.6899773Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6899918Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6917174Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:07.6930083Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6930219Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6948574Z Entering 'third_party/cutlass' 2025-12-04T08:54:07.6962687Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6962815Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6984261Z Entering 'third_party/fbgemm' 2025-12-04T08:54:07.6995858Z url.https://github.com/.insteadof 2025-12-04T08:54:07.6995994Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7016191Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:07.7031448Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7031591Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7057539Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:07.7073302Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7073520Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7098378Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:07.7112798Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7112927Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7130301Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:07.7145942Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7146062Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7171236Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:07.7184220Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7184340Z url.https://github.com/.insteadof 
2025-12-04T08:54:07.7201942Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:07.7217808Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7217921Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7235470Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:07.7249511Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7249640Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7269619Z Entering 'third_party/flash-attention' 2025-12-04T08:54:07.7282404Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7282573Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7299503Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:07.7316405Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7316615Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7334695Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:07.7349534Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7349667Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7371151Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:07.7383931Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7384052Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7401986Z Entering 'third_party/fmt' 2025-12-04T08:54:07.7417160Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7417286Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7437316Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:07.7456162Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7456282Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7474343Z Entering 'third_party/gloo' 2025-12-04T08:54:07.7489773Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7489898Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7509271Z Entering 'third_party/googletest' 2025-12-04T08:54:07.7522811Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7522945Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7541420Z 
Entering 'third_party/ideep' 2025-12-04T08:54:07.7554080Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7554223Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7570098Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:07.7582896Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7583278Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7605588Z Entering 'third_party/ittapi' 2025-12-04T08:54:07.7619119Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7619242Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7635949Z Entering 'third_party/kineto' 2025-12-04T08:54:07.7648646Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7648763Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7665522Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:07.7678857Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7678978Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7695234Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:07.7709785Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7709904Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7728454Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:07.7744071Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7744199Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7765040Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:07.7782163Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7782389Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7811357Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:07.7826771Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7826901Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7844136Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 
2025-12-04T08:54:07.7856884Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7857016Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7877679Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:07.7892494Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7892617Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7912472Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:07.7926077Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7943576Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7943768Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:07.7957578Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7957719Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7974823Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:07.7988093Z url.https://github.com/.insteadof 2025-12-04T08:54:07.7988458Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8004984Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:07.8020795Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8020934Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8040668Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:07.8053960Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8054085Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8071830Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:07.8084650Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8084774Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8106861Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:07.8120357Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8120608Z 
url.https://github.com/.insteadof 2025-12-04T08:54:07.8137292Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:07.8150005Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8150161Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8168883Z Entering 'third_party/kleidiai' 2025-12-04T08:54:07.8181286Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8181423Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8199163Z Entering 'third_party/mimalloc' 2025-12-04T08:54:07.8211312Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8211442Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8227657Z Entering 'third_party/nlohmann' 2025-12-04T08:54:07.8241721Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8241884Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8261425Z Entering 'third_party/onnx' 2025-12-04T08:54:07.8275039Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8275166Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8297608Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:07.8310289Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8310468Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8328896Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:07.8342192Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8342362Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8361084Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:07.8374162Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8375216Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8392207Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:07.8406175Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8406360Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8423512Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:07.8437118Z url.https://github.com/.insteadof 
2025-12-04T08:54:07.8437258Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8453488Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:07.8467268Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8467401Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8488281Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:07.8504160Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8504319Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8527034Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:07.8543132Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8543273Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8563124Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:07.8575822Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8575958Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8593643Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:07.8606295Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8606438Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8625147Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:07.8638995Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8639125Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8657669Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:07.8670287Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8670416Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8694068Z Entering 'third_party/pocketfft' 2025-12-04T08:54:07.8707913Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8708044Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8725049Z Entering 'third_party/protobuf' 2025-12-04T08:54:07.8738929Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8739062Z 
url.https://github.com/.insteadof 2025-12-04T08:54:07.8758252Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:07.8771054Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8771187Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8788371Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:07.8799897Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8800031Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8818697Z Entering 'third_party/psimd' 2025-12-04T08:54:07.8833225Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8833530Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8852468Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:07.8877283Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8877430Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8905605Z Entering 'third_party/pybind11' 2025-12-04T08:54:07.8927993Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8928131Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8956231Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:07.8976398Z url.https://github.com/.insteadof 2025-12-04T08:54:07.8976548Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9002263Z Entering 'third_party/sleef' 2025-12-04T08:54:07.9016982Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9017105Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9042357Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:07.9055518Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9055647Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9074854Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:07.9088832Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9088957Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9106744Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:07.9119228Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9119358Z url.https://github.com/.insteadof 
2025-12-04T08:54:07.9133573Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:07.9144355Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9144651Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9174669Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:07.9192444Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9192633Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9211831Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:07.9226382Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9226580Z url.https://github.com/.insteadof 2025-12-04T08:54:07.9260394Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T08:54:07.9416582Z Entering 'android/libs/fbjni' 2025-12-04T08:54:07.9437531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T08:54:07.9448697Z Entering 'third_party/FP16' 2025-12-04T08:54:07.9471818Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T08:54:07.9480518Z Entering 'third_party/FXdiv' 2025-12-04T08:54:07.9501190Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T08:54:07.9511998Z Entering 'third_party/NNPACK' 2025-12-04T08:54:07.9532696Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T08:54:07.9543213Z Entering 'third_party/NVTX' 2025-12-04T08:54:07.9564089Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T08:54:07.9574269Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:07.9594991Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T08:54:07.9605306Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:07.9627469Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T08:54:07.9643913Z Entering 'third_party/aiter' 2025-12-04T08:54:07.9664437Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T08:54:07.9677817Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:07.9698936Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T08:54:07.9714104Z Entering 'third_party/benchmark' 2025-12-04T08:54:07.9733592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:07.9744117Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:07.9765215Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T08:54:07.9779107Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:07.9800039Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T08:54:07.9810606Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:07.9829605Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T08:54:07.9839781Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:07.9874526Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T08:54:07.9898591Z Entering 'third_party/cutlass' 2025-12-04T08:54:07.9921264Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T08:54:07.9935686Z Entering 'third_party/fbgemm' 2025-12-04T08:54:07.9957178Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T08:54:07.9969537Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:07.9992971Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T08:54:08.0003401Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:08.0024904Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T08:54:08.0038407Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:08.0057479Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T08:54:08.0067250Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:08.0089565Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T08:54:08.0104092Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:08.0125445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T08:54:08.0135241Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:08.0157008Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T08:54:08.0165631Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:08.0185325Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T08:54:08.0196585Z Entering 'third_party/flash-attention' 2025-12-04T08:54:08.0216185Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T08:54:08.0226441Z Entering 
'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:08.0248541Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T08:54:08.0261277Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:08.0285445Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T08:54:08.0300077Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:08.0323265Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T08:54:08.0333993Z Entering 'third_party/fmt' 2025-12-04T08:54:08.0352779Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:08.0362915Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:08.0382240Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T08:54:08.0392661Z Entering 'third_party/gloo' 2025-12-04T08:54:08.0415762Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T08:54:08.0426496Z Entering 'third_party/googletest' 2025-12-04T08:54:08.0447451Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.0458484Z Entering 'third_party/ideep' 2025-12-04T08:54:08.0479453Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T08:54:08.0489741Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:08.0511832Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T08:54:08.0529636Z Entering 'third_party/ittapi' 2025-12-04T08:54:08.0565584Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 
2025-12-04T08:54:08.0577525Z Entering 'third_party/kineto' 2025-12-04T08:54:08.0601981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T08:54:08.0613918Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:08.0632958Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T08:54:08.0641397Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:08.0665939Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T08:54:08.0676945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:08.0699081Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T08:54:08.0709783Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:08.0728257Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T08:54:08.0737082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:08.0757078Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T08:54:08.0767583Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:08.0788626Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T08:54:08.0801280Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:08.0822092Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T08:54:08.0831784Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:08.0851794Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.0861409Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:08.0881687Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T08:54:08.0892091Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:08.0912105Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T08:54:08.0926883Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:08.0948184Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:08.0956377Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.0978262Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:08.0989498Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:08.1010770Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:08.1024815Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:08.1046218Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T08:54:08.1056171Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:08.1076881Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.1088189Z Entering 'third_party/kleidiai' 2025-12-04T08:54:08.1108359Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T08:54:08.1119140Z Entering 'third_party/mimalloc' 2025-12-04T08:54:08.1139829Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T08:54:08.1150240Z Entering 'third_party/nlohmann' 2025-12-04T08:54:08.1171389Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T08:54:08.1182778Z Entering 'third_party/onnx' 2025-12-04T08:54:08.1214313Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T08:54:08.1235126Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:08.1256945Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:08.1271098Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:08.1293702Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T08:54:08.1305214Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:08.1325708Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:08.1335792Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:08.1355105Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.1365104Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:08.1390158Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T08:54:08.1399694Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:08.1418073Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T08:54:08.1427617Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:08.1448071Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T08:54:08.1457996Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:08.1478769Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T08:54:08.1488393Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:08.1515433Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T08:54:08.1525447Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.1547168Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T08:54:08.1557434Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:08.1577967Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T08:54:08.1590273Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:08.1611235Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T08:54:08.1630071Z Entering 'third_party/pocketfft' 2025-12-04T08:54:08.1649870Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T08:54:08.1659669Z Entering 'third_party/protobuf' 2025-12-04T08:54:08.1681226Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T08:54:08.1693064Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:08.1713675Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T08:54:08.1724110Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:08.1743339Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.1754953Z Entering 'third_party/psimd' 2025-12-04T08:54:08.1775668Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T08:54:08.1785913Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:08.1805639Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 
2025-12-04T08:54:08.1816464Z Entering 'third_party/pybind11' 2025-12-04T08:54:08.1846045Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:08.1857693Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:08.1882567Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T08:54:08.1892616Z Entering 'third_party/sleef' 2025-12-04T08:54:08.1914038Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T08:54:08.1924365Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:08.1944719Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T08:54:08.1955656Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:08.1979674Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T08:54:08.1990285Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:08.2012493Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T08:54:08.2022348Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:08.2042615Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T08:54:08.2054163Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:08.2083051Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T08:54:08.2092444Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:08.2112916Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config 
remote.origin.url 2025-12-04T08:54:08.2329260Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T08:54:08.2490694Z Entering 'android/libs/fbjni' 2025-12-04T08:54:08.2524202Z Entering 'third_party/FP16' 2025-12-04T08:54:08.2550705Z Entering 'third_party/FXdiv' 2025-12-04T08:54:08.2580650Z Entering 'third_party/NNPACK' 2025-12-04T08:54:08.2607341Z Entering 'third_party/NVTX' 2025-12-04T08:54:08.2633150Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:08.2658939Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:08.2691466Z Entering 'third_party/aiter' 2025-12-04T08:54:08.2717296Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:08.2744295Z Entering 'third_party/benchmark' 2025-12-04T08:54:08.2770015Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:08.2799552Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:08.2828520Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:08.2851091Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:08.2874488Z Entering 'third_party/cutlass' 2025-12-04T08:54:08.2903191Z Entering 'third_party/fbgemm' 2025-12-04T08:54:08.2929869Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:08.2949799Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T08:54:08.2972328Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:08.2990996Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:08.3014236Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:08.3033167Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:08.3051462Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:08.3079985Z Entering 'third_party/flash-attention' 2025-12-04T08:54:08.3112824Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:08.3137976Z Entering 'third_party/flash-attention/csrc/cutlass' 
2025-12-04T08:54:08.3171756Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:08.3198523Z Entering 'third_party/fmt' 2025-12-04T08:54:08.3233000Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:08.3260271Z Entering 'third_party/gloo' 2025-12-04T08:54:08.3288790Z Entering 'third_party/googletest' 2025-12-04T08:54:08.3319052Z Entering 'third_party/ideep' 2025-12-04T08:54:08.3344240Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:08.3374627Z Entering 'third_party/ittapi' 2025-12-04T08:54:08.3401777Z Entering 'third_party/kineto' 2025-12-04T08:54:08.3433072Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:08.3450826Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:08.3473804Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:08.3495213Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:08.3514448Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:08.3534858Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:08.3556941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:08.3577930Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:08.3597863Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:08.3617482Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:08.3635921Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:08.3655055Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.3677165Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 
2025-12-04T08:54:08.3702097Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:08.3722081Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:08.3744536Z Entering 'third_party/kleidiai' 2025-12-04T08:54:08.3770298Z Entering 'third_party/mimalloc' 2025-12-04T08:54:08.3793321Z Entering 'third_party/nlohmann' 2025-12-04T08:54:08.3816490Z Entering 'third_party/onnx' 2025-12-04T08:54:08.3846258Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:08.3870027Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:08.3894262Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:08.3914944Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:08.3938339Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:08.3958098Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:08.3986441Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:08.4009002Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:08.4029153Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:08.4052055Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.4073420Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:08.4098380Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:08.4129438Z Entering 'third_party/pocketfft' 2025-12-04T08:54:08.4154140Z Entering 'third_party/protobuf' 2025-12-04T08:54:08.4180792Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:08.4201554Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:08.4224318Z Entering 'third_party/psimd' 2025-12-04T08:54:08.4248252Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:08.4271386Z Entering 
'third_party/pybind11' 2025-12-04T08:54:08.4295971Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:08.4320113Z Entering 'third_party/sleef' 2025-12-04T08:54:08.4343347Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:08.4369307Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:08.4390805Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:08.4410510Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:08.4433154Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:08.4452679Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:08.4489928Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T08:54:08.4646367Z Entering 'android/libs/fbjni' 2025-12-04T08:54:08.4666163Z Entering 'third_party/FP16' 2025-12-04T08:54:08.4687273Z Entering 'third_party/FXdiv' 2025-12-04T08:54:08.4714798Z Entering 'third_party/NNPACK' 2025-12-04T08:54:08.4740274Z Entering 'third_party/NVTX' 2025-12-04T08:54:08.4768480Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T08:54:08.4790941Z Entering 'third_party/XNNPACK' 2025-12-04T08:54:08.4820393Z Entering 'third_party/aiter' 2025-12-04T08:54:08.4842882Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T08:54:08.4870888Z Entering 'third_party/benchmark' 2025-12-04T08:54:08.4891307Z Entering 'third_party/composable_kernel' 2025-12-04T08:54:08.4913379Z Entering 'third_party/cpp-httplib' 2025-12-04T08:54:08.4933432Z Entering 'third_party/cpuinfo' 2025-12-04T08:54:08.4955144Z Entering 'third_party/cudnn_frontend' 2025-12-04T08:54:08.4977132Z Entering 'third_party/cutlass' 2025-12-04T08:54:08.5001239Z Entering 'third_party/fbgemm' 2025-12-04T08:54:08.5026487Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T08:54:08.5045487Z Entering 'third_party/fbgemm/external/composable_kernel' 
2025-12-04T08:54:08.5068767Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T08:54:08.5089046Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T08:54:08.5111961Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T08:54:08.5131338Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T08:54:08.5151528Z Entering 'third_party/fbgemm/external/json' 2025-12-04T08:54:08.5178636Z Entering 'third_party/flash-attention' 2025-12-04T08:54:08.5199616Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T08:54:08.5225066Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T08:54:08.5250378Z Entering 'third_party/flatbuffers' 2025-12-04T08:54:08.5271463Z Entering 'third_party/fmt' 2025-12-04T08:54:08.5291726Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T08:54:08.5313798Z Entering 'third_party/gloo' 2025-12-04T08:54:08.5333967Z Entering 'third_party/googletest' 2025-12-04T08:54:08.5354998Z Entering 'third_party/ideep' 2025-12-04T08:54:08.5374977Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T08:54:08.5401611Z Entering 'third_party/ittapi' 2025-12-04T08:54:08.5422040Z Entering 'third_party/kineto' 2025-12-04T08:54:08.5443433Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T08:54:08.5464058Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T08:54:08.5484348Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T08:54:08.5511128Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T08:54:08.5535617Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T08:54:08.5558340Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T08:54:08.5581100Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T08:54:08.5601358Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T08:54:08.5622345Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T08:54:08.5642304Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T08:54:08.5661716Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T08:54:08.5680381Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.5700676Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:08.5724834Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T08:54:08.5744104Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T08:54:08.5765956Z Entering 'third_party/kleidiai' 2025-12-04T08:54:08.5787215Z Entering 'third_party/mimalloc' 2025-12-04T08:54:08.5808719Z Entering 'third_party/nlohmann' 2025-12-04T08:54:08.5829987Z Entering 'third_party/onnx' 2025-12-04T08:54:08.5861340Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T08:54:08.5885059Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T08:54:08.5905819Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T08:54:08.5927193Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T08:54:08.5946280Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T08:54:08.5969168Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T08:54:08.5997601Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T08:54:08.6019781Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T08:54:08.6039541Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T08:54:08.6058990Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T08:54:08.6078210Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T08:54:08.6098433Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T08:54:08.6126936Z Entering 'third_party/pocketfft' 2025-12-04T08:54:08.6148620Z Entering 'third_party/protobuf' 2025-12-04T08:54:08.6174266Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T08:54:08.6196316Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T08:54:08.6219648Z Entering 'third_party/psimd' 2025-12-04T08:54:08.6241200Z Entering 'third_party/pthreadpool' 2025-12-04T08:54:08.6262485Z Entering 'third_party/pybind11' 2025-12-04T08:54:08.6284971Z Entering 'third_party/python-peachpy' 2025-12-04T08:54:08.6304601Z Entering 'third_party/sleef' 2025-12-04T08:54:08.6324810Z Entering 'third_party/tensorpipe' 2025-12-04T08:54:08.6342684Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T08:54:08.6368276Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T08:54:08.6386943Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T08:54:08.6409288Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T08:54:08.6434392Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T08:54:08.6470889Z ##[endgroup] 2025-12-04T08:54:08.6620558Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T08:54:08.6732120Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:08.6840723Z Prepare all required actions 2025-12-04T08:54:08.6841028Z Getting action download info 2025-12-04T08:54:08.9099897Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T08:54:09.6158159Z ##[group]Run ./.github/actions/setup-rocm 2025-12-04T08:54:09.6158302Z env: 2025-12-04T08:54:09.6158399Z GIT_DEFAULT_BRANCH: main 
2025-12-04T08:54:09.6158507Z ##[endgroup] 2025-12-04T08:54:09.6171816Z ##[group]Run dpkg -l | grep -E " rocm" 2025-12-04T08:54:09.6172025Z dpkg -l | grep -E " rocm" 2025-12-04T08:54:09.6176477Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.6176617Z env: 2025-12-04T08:54:09.6176709Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.6176818Z ##[endgroup] 2025-12-04T08:54:09.6231088Z ii rocm-cmake 0.14.0.60401-83~22.04 amd64 rocm-cmake built using CMake 2025-12-04T08:54:09.6231548Z ii rocm-core 6.4.1.60401-83~22.04 amd64 ROCm Runtime software stack 2025-12-04T08:54:09.6231770Z ii rocm-dbgapi 0.77.2.60401-83~22.04 amd64 Library to provide AMD GPU debugger API 2025-12-04T08:54:09.6232072Z ii rocm-debug-agent 2.0.4.60401-83~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-12-04T08:54:09.6232322Z ii rocm-dev 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T08:54:09.6232561Z ii rocm-device-libs 1.0.0.60401-83~22.04 amd64 Radeon Open Compute - device libraries 2025-12-04T08:54:09.6232777Z ii rocm-gdb 15.2.60401-83~22.04 amd64 ROCgdb 2025-12-04T08:54:09.6232973Z ii rocm-llvm 19.0.0.25184.60401-83~22.04 amd64 ROCm core compiler 2025-12-04T08:54:09.6233183Z ii rocm-opencl 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T08:54:09.6233399Z ii rocm-opencl-dev 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T08:54:09.6233887Z ii rocm-smi-lib 7.5.0.60401-83~22.04 amd64 AMD System Management libraries 2025-12-04T08:54:09.6234238Z ii rocm-utils 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T08:54:09.6234480Z ii rocminfo 1.0.0.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-12-04T08:54:09.6248541Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T08:54:09.6248807Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T08:54:09.6248982Z # shellcheck disable=SC2046 
2025-12-04T08:54:09.6249133Z docker stop $(docker ps -q) || true 2025-12-04T08:54:09.6249259Z # Prune all stopped containers. 2025-12-04T08:54:09.6249385Z docker container prune -f 2025-12-04T08:54:09.6254043Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.6254179Z env: 2025-12-04T08:54:09.6254266Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.6254366Z ##[endgroup] 2025-12-04T08:54:09.6470203Z docker: 'docker stop' requires at least 1 argument 2025-12-04T08:54:09.6470306Z 2025-12-04T08:54:09.6470372Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 2025-12-04T08:54:09.6470478Z 2025-12-04T08:54:09.6470535Z See 'docker stop --help' for more information 2025-12-04T08:54:09.6552157Z Total reclaimed space: 0B 2025-12-04T08:54:09.6575422Z ##[group]Run cat /etc/os-release || true 2025-12-04T08:54:09.6575598Z cat /etc/os-release || true 2025-12-04T08:54:09.6575754Z cat /etc/apt/sources.list.d/rocm.list || true 2025-12-04T08:54:09.6576049Z cat /opt/rocm/.info/version || true 2025-12-04T08:54:09.6576169Z whoami 2025-12-04T08:54:09.6580411Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.6580556Z env: 2025-12-04T08:54:09.6580639Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.6580742Z ##[endgroup] 2025-12-04T08:54:09.6602970Z PRETTY_NAME="Ubuntu 22.04.5 LTS" 2025-12-04T08:54:09.6603104Z NAME="Ubuntu" 2025-12-04T08:54:09.6603199Z VERSION_ID="22.04" 2025-12-04T08:54:09.6603318Z VERSION="22.04.5 LTS (Jammy Jellyfish)" 2025-12-04T08:54:09.6603440Z VERSION_CODENAME=jammy 2025-12-04T08:54:09.6603539Z ID=ubuntu 2025-12-04T08:54:09.6603622Z ID_LIKE=debian 2025-12-04T08:54:09.6603738Z HOME_URL="https://www.ubuntu.com/" 2025-12-04T08:54:09.6603870Z SUPPORT_URL="https://help.ubuntu.com/" 2025-12-04T08:54:09.6604022Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-12-04T08:54:09.6604234Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-12-04T08:54:09.6604562Z 
UBUNTU_CODENAME=jammy 2025-12-04T08:54:09.6608869Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.4.1 jammy main 2025-12-04T08:54:09.6613627Z 6.4.1-83 2025-12-04T08:54:09.6619307Z runner 2025-12-04T08:54:09.6637036Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-12-04T08:54:09.6637221Z dpkg -l | grep -E " amdgpu" 2025-12-04T08:54:09.6641475Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.6641620Z env: 2025-12-04T08:54:09.6641706Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.6641812Z ##[endgroup] 2025-12-04T08:54:09.6693815Z ii amdgpu-core 1:6.4.60401-2164967.22.04 all Core meta package for unified amdgpu driver. 2025-12-04T08:54:09.6694073Z ii amdgpu-install 6.4.60401-2164967.22.04 all AMDGPU driver repository and installer 2025-12-04T08:54:09.6710432Z ##[group]Run rocm-smi 2025-12-04T08:54:09.6710579Z rocm-smi 2025-12-04T08:54:09.6714791Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.6714939Z env: 2025-12-04T08:54:09.6715025Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.6715128Z ##[endgroup] 2025-12-04T08:54:09.7260475Z 2025-12-04T08:54:09.7260487Z 2025-12-04T08:54:09.7260855Z ============================================ ROCm System Management Interface ============================================ 2025-12-04T08:54:09.7261415Z ====================================================== Concise Info ====================================================== 2025-12-04T08:54:09.7262184Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-12-04T08:54:09.7263268Z (DID, GUID) (Junction) (Socket) (Mem, Compute, ID) 2025-12-04T08:54:09.7263801Z ========================================================================================================================== 2025-12-04T08:54:09.7264773Z 0 3 0x74a5, 51110 25.0°C 117.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T08:54:09.7265278Z
========================================================================================================================== 2025-12-04T08:54:09.7265693Z ================================================== End of ROCm SMI Log =================================================== 2025-12-04T08:54:09.7319568Z ##[group]Run rocminfo 2025-12-04T08:54:09.7319686Z rocminfo 2025-12-04T08:54:09.7322693Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.7322840Z env: 2025-12-04T08:54:09.7322929Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.7323038Z ##[endgroup] 2025-12-04T08:54:09.7850679Z ROCk module version 6.12.12 is loaded 2025-12-04T08:54:09.7850859Z ===================== 2025-12-04T08:54:09.7851096Z HSA System Attributes 2025-12-04T08:54:09.7851282Z ===================== 2025-12-04T08:54:09.7851652Z Runtime Version: 1.15 2025-12-04T08:54:09.7851767Z Runtime Ext Version: 1.7 2025-12-04T08:54:09.7851983Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T08:54:09.7852164Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T08:54:09.7852361Z Machine Model: LARGE 2025-12-04T08:54:09.7852517Z System Endianness: LITTLE 2025-12-04T08:54:09.7852658Z Mwaitx: DISABLED 2025-12-04T08:54:09.7852771Z XNACK enabled: NO 2025-12-04T08:54:09.7852911Z DMAbuf Support: YES 2025-12-04T08:54:09.7853242Z VMM Support: YES 2025-12-04T08:54:09.7853418Z 2025-12-04T08:54:09.7853539Z ========== 2025-12-04T08:54:09.7853793Z HSA Agents 2025-12-04T08:54:09.7854105Z ========== 2025-12-04T08:54:09.7854517Z ******* 2025-12-04T08:54:09.7854797Z Agent 1 2025-12-04T08:54:09.7855107Z ******* 2025-12-04T08:54:09.7855421Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T08:54:09.7855955Z Uuid: CPU-XX 2025-12-04T08:54:09.7856358Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T08:54:09.7856775Z Vendor Name: CPU 2025-12-04T08:54:09.7857168Z Feature: None specified 2025-12-04T08:54:09.7857555Z Profile: FULL_PROFILE 2025-12-04T08:54:09.7857954Z Float Round Mode: NEAR 2025-12-04T08:54:09.7858358Z Max Queue Number: 0(0x0) 2025-12-04T08:54:09.7858749Z Queue Min Size: 0(0x0) 2025-12-04T08:54:09.7859131Z Queue Max Size: 0(0x0) 2025-12-04T08:54:09.7859527Z Queue Type: MULTI 2025-12-04T08:54:09.7859892Z Node: 0 2025-12-04T08:54:09.7860258Z Device Type: CPU 2025-12-04T08:54:09.7860599Z Cache Info: 2025-12-04T08:54:09.7860909Z L1: 49152(0xc000) KB 2025-12-04T08:54:09.7861269Z Chip ID: 0(0x0) 2025-12-04T08:54:09.7861642Z ASIC Revision: 0(0x0) 2025-12-04T08:54:09.7862101Z Cacheline Size: 64(0x40) 2025-12-04T08:54:09.7862500Z Max Clock Freq. (MHz): 3300 2025-12-04T08:54:09.7862866Z BDFID: 0 2025-12-04T08:54:09.7863011Z Internal Node ID: 0 2025-12-04T08:54:09.7863160Z Compute Unit: 64 2025-12-04T08:54:09.7863311Z SIMDs per CU: 0 2025-12-04T08:54:09.7863463Z Shader Engines: 0 2025-12-04T08:54:09.7863615Z Shader Arrs. per Eng.: 0 2025-12-04T08:54:09.7863774Z WatchPts on Addr. 
Ranges:1 2025-12-04T08:54:09.7863910Z Memory Properties: 2025-12-04T08:54:09.7864020Z Features: None 2025-12-04T08:54:09.7864127Z Pool Info: 2025-12-04T08:54:09.7864230Z Pool 1 2025-12-04T08:54:09.7864361Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T08:54:09.7864513Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T08:54:09.7864659Z Allocatable: TRUE 2025-12-04T08:54:09.7864813Z Alloc Granule: 4KB 2025-12-04T08:54:09.7865018Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7865231Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7865385Z Accessible by all: TRUE 2025-12-04T08:54:09.7865523Z Pool 2 2025-12-04T08:54:09.7865651Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T08:54:09.7865801Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T08:54:09.7865947Z Allocatable: TRUE 2025-12-04T08:54:09.7866098Z Alloc Granule: 4KB 2025-12-04T08:54:09.7866257Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7866416Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7866569Z Accessible by all: TRUE 2025-12-04T08:54:09.7866739Z Pool 3 2025-12-04T08:54:09.7866868Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T08:54:09.7867011Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T08:54:09.7867157Z Allocatable: TRUE 2025-12-04T08:54:09.7867307Z Alloc Granule: 4KB 2025-12-04T08:54:09.7867465Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7867620Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7867772Z Accessible by all: TRUE 2025-12-04T08:54:09.7867904Z Pool 4 2025-12-04T08:54:09.7868026Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T08:54:09.7868166Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T08:54:09.7868315Z Allocatable: TRUE 2025-12-04T08:54:09.7868460Z Alloc Granule: 4KB 2025-12-04T08:54:09.7868614Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7868769Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7868918Z Accessible by all: TRUE 2025-12-04T08:54:09.7869049Z ISA Info: 2025-12-04T08:54:09.7869147Z ******* 2025-12-04T08:54:09.7869238Z Agent 2 2025-12-04T08:54:09.7869330Z ******* 
2025-12-04T08:54:09.7869439Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T08:54:09.7869580Z Uuid: CPU-XX 2025-12-04T08:54:09.7869728Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T08:54:09.7869878Z Vendor Name: CPU 2025-12-04T08:54:09.7870031Z Feature: None specified 2025-12-04T08:54:09.7870178Z Profile: FULL_PROFILE 2025-12-04T08:54:09.7870323Z Float Round Mode: NEAR 2025-12-04T08:54:09.7870469Z Max Queue Number: 0(0x0) 2025-12-04T08:54:09.7870611Z Queue Min Size: 0(0x0) 2025-12-04T08:54:09.7870753Z Queue Max Size: 0(0x0) 2025-12-04T08:54:09.7870899Z Queue Type: MULTI 2025-12-04T08:54:09.7871031Z Node: 1 2025-12-04T08:54:09.7871167Z Device Type: CPU 2025-12-04T08:54:09.7871293Z Cache Info: 2025-12-04T08:54:09.7871401Z L1: 49152(0xc000) KB 2025-12-04T08:54:09.7871566Z Chip ID: 0(0x0) 2025-12-04T08:54:09.7871706Z ASIC Revision: 0(0x0) 2025-12-04T08:54:09.7871912Z Cacheline Size: 64(0x40) 2025-12-04T08:54:09.7872173Z Max Clock Freq. (MHz): 3300 2025-12-04T08:54:09.7872312Z BDFID: 0 2025-12-04T08:54:09.7872451Z Internal Node ID: 1 2025-12-04T08:54:09.7872596Z Compute Unit: 64 2025-12-04T08:54:09.7872736Z SIMDs per CU: 0 2025-12-04T08:54:09.7872879Z Shader Engines: 0 2025-12-04T08:54:09.7873027Z Shader Arrs. per Eng.: 0 2025-12-04T08:54:09.7873178Z WatchPts on Addr. 
Ranges:1 2025-12-04T08:54:09.7873352Z Memory Properties: 2025-12-04T08:54:09.7873457Z Features: None 2025-12-04T08:54:09.7873561Z Pool Info: 2025-12-04T08:54:09.7873659Z Pool 1 2025-12-04T08:54:09.7873779Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T08:54:09.7873923Z Size: 1585311828(0x5e7df054) KB 2025-12-04T08:54:09.7874063Z Allocatable: TRUE 2025-12-04T08:54:09.7874213Z Alloc Granule: 4KB 2025-12-04T08:54:09.7874404Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7874561Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7874719Z Accessible by all: TRUE 2025-12-04T08:54:09.7874856Z Pool 2 2025-12-04T08:54:09.7874982Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T08:54:09.7875143Z Size: 1585311828(0x5e7df054) KB 2025-12-04T08:54:09.7875284Z Allocatable: TRUE 2025-12-04T08:54:09.7875440Z Alloc Granule: 4KB 2025-12-04T08:54:09.7875600Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7875756Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7875919Z Accessible by all: TRUE 2025-12-04T08:54:09.7876055Z Pool 3 2025-12-04T08:54:09.7876180Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T08:54:09.7876329Z Size: 1585311828(0x5e7df054) KB 2025-12-04T08:54:09.7876469Z Allocatable: TRUE 2025-12-04T08:54:09.7876624Z Alloc Granule: 4KB 2025-12-04T08:54:09.7876793Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7876948Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7877105Z Accessible by all: TRUE 2025-12-04T08:54:09.7877241Z Pool 4 2025-12-04T08:54:09.7877367Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T08:54:09.7877513Z Size: 1585311828(0x5e7df054) KB 2025-12-04T08:54:09.7877657Z Allocatable: TRUE 2025-12-04T08:54:09.7877811Z Alloc Granule: 4KB 2025-12-04T08:54:09.7877971Z Alloc Recommended Granule:4KB 2025-12-04T08:54:09.7878126Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7878281Z Accessible by all: TRUE 2025-12-04T08:54:09.7878421Z ISA Info: 2025-12-04T08:54:09.7878557Z ******* 2025-12-04T08:54:09.7878657Z Agent 3 2025-12-04T08:54:09.7878752Z ******* 
2025-12-04T08:54:09.7878865Z Name: gfx942 2025-12-04T08:54:09.7879010Z Uuid: GPU-45724ba446fb6af5 2025-12-04T08:54:09.7879158Z Marketing Name: AMD Instinct MI325X 2025-12-04T08:54:09.7879314Z Vendor Name: AMD 2025-12-04T08:54:09.7879463Z Feature: KERNEL_DISPATCH 2025-12-04T08:54:09.7879609Z Profile: BASE_PROFILE 2025-12-04T08:54:09.7879761Z Float Round Mode: NEAR 2025-12-04T08:54:09.7879909Z Max Queue Number: 128(0x80) 2025-12-04T08:54:09.7880058Z Queue Min Size: 64(0x40) 2025-12-04T08:54:09.7880253Z Queue Max Size: 131072(0x20000) 2025-12-04T08:54:09.7880397Z Queue Type: MULTI 2025-12-04T08:54:09.7880536Z Node: 2 2025-12-04T08:54:09.7880676Z Device Type: GPU 2025-12-04T08:54:09.7880803Z Cache Info: 2025-12-04T08:54:09.7880919Z L1: 32(0x20) KB 2025-12-04T08:54:09.7881051Z L2: 4096(0x1000) KB 2025-12-04T08:54:09.7881177Z L3: 262144(0x40000) KB 2025-12-04T08:54:09.7881309Z Chip ID: 29861(0x74a5) 2025-12-04T08:54:09.7881449Z ASIC Revision: 1(0x1) 2025-12-04T08:54:09.7881603Z Cacheline Size: 128(0x80) 2025-12-04T08:54:09.7881760Z Max Clock Freq. (MHz): 2100 2025-12-04T08:54:09.7881921Z BDFID: 1280 2025-12-04T08:54:09.7882064Z Internal Node ID: 2 2025-12-04T08:54:09.7882213Z Compute Unit: 304 2025-12-04T08:54:09.7882354Z SIMDs per CU: 4 2025-12-04T08:54:09.7882500Z Shader Engines: 32 2025-12-04T08:54:09.7882649Z Shader Arrs. per Eng.: 1 2025-12-04T08:54:09.7882807Z WatchPts on Addr. 
Ranges:4 2025-12-04T08:54:09.7882967Z Coherent Host Access: FALSE 2025-12-04T08:54:09.7883100Z Memory Properties: 2025-12-04T08:54:09.7883216Z Features: KERNEL_DISPATCH 2025-12-04T08:54:09.7883359Z Fast F16 Operation: TRUE 2025-12-04T08:54:09.7883513Z Wavefront Size: 64(0x40) 2025-12-04T08:54:09.7883666Z Workgroup Max Size: 1024(0x400) 2025-12-04T08:54:09.7883804Z Workgroup Max Size per Dimension: 2025-12-04T08:54:09.7883931Z x 1024(0x400) 2025-12-04T08:54:09.7884058Z y 1024(0x400) 2025-12-04T08:54:09.7884180Z z 1024(0x400) 2025-12-04T08:54:09.7884318Z Max Waves Per CU: 32(0x20) 2025-12-04T08:54:09.7884475Z Max Work-item Per CU: 2048(0x800) 2025-12-04T08:54:09.7884625Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T08:54:09.7884763Z Grid Max Size per Dimension: 2025-12-04T08:54:09.7884876Z x 4294967295(0xffffffff) 2025-12-04T08:54:09.7885007Z y 4294967295(0xffffffff) 2025-12-04T08:54:09.7885544Z z 4294967295(0xffffffff) 2025-12-04T08:54:09.7885691Z Max fbarriers/Workgrp: 32 2025-12-04T08:54:09.7890936Z Packet Processor uCode:: 185 2025-12-04T08:54:09.7891109Z SDMA engine uCode:: 24 2025-12-04T08:54:09.7891262Z IOMMU Support:: None 2025-12-04T08:54:09.7891399Z Pool Info: 2025-12-04T08:54:09.7891509Z Pool 1 2025-12-04T08:54:09.7891637Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T08:54:09.7891788Z Size: 268419072(0xfffc000) KB 2025-12-04T08:54:09.7891989Z Allocatable: TRUE 2025-12-04T08:54:09.7892143Z Alloc Granule: 4KB 2025-12-04T08:54:09.7892386Z Alloc Recommended Granule:2048KB 2025-12-04T08:54:09.7892545Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7892704Z Accessible by all: FALSE 2025-12-04T08:54:09.7892841Z Pool 2 2025-12-04T08:54:09.7892968Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T08:54:09.7893117Z Size: 268419072(0xfffc000) KB 2025-12-04T08:54:09.7893257Z Allocatable: TRUE 2025-12-04T08:54:09.7893412Z Alloc Granule: 4KB 2025-12-04T08:54:09.7893574Z Alloc Recommended Granule:2048KB 2025-12-04T08:54:09.7893734Z Alloc Alignment: 4KB 
2025-12-04T08:54:09.7893894Z Accessible by all: FALSE 2025-12-04T08:54:09.7894036Z Pool 3 2025-12-04T08:54:09.7894161Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T08:54:09.7894309Z Size: 268419072(0xfffc000) KB 2025-12-04T08:54:09.7894458Z Allocatable: TRUE 2025-12-04T08:54:09.7894609Z Alloc Granule: 4KB 2025-12-04T08:54:09.7894771Z Alloc Recommended Granule:2048KB 2025-12-04T08:54:09.7894928Z Alloc Alignment: 4KB 2025-12-04T08:54:09.7895087Z Accessible by all: FALSE 2025-12-04T08:54:09.7895224Z Pool 4 2025-12-04T08:54:09.7895344Z Segment: GROUP 2025-12-04T08:54:09.7895487Z Size: 64(0x40) KB 2025-12-04T08:54:09.7895633Z Allocatable: FALSE 2025-12-04T08:54:09.7895794Z Alloc Granule: 0KB 2025-12-04T08:54:09.7895955Z Alloc Recommended Granule:0KB 2025-12-04T08:54:09.7896114Z Alloc Alignment: 0KB 2025-12-04T08:54:09.7896272Z Accessible by all: FALSE 2025-12-04T08:54:09.7896411Z ISA Info: 2025-12-04T08:54:09.7896514Z ISA 1 2025-12-04T08:54:09.7896649Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T08:54:09.7896815Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T08:54:09.7896972Z Profiles: HSA_PROFILE_BASE 2025-12-04T08:54:09.7897134Z Default Rounding Mode: NEAR 2025-12-04T08:54:09.7897295Z Default Rounding Mode: NEAR 2025-12-04T08:54:09.7897495Z Fast f16: TRUE 2025-12-04T08:54:09.7897648Z Workgroup Max Size: 1024(0x400) 2025-12-04T08:54:09.7897792Z Workgroup Max Size per Dimension: 2025-12-04T08:54:09.7898110Z x 1024(0x400) 2025-12-04T08:54:09.7898275Z y 1024(0x400) 2025-12-04T08:54:09.7898402Z z 1024(0x400) 2025-12-04T08:54:09.7898549Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T08:54:09.7898692Z Grid Max Size per Dimension: 2025-12-04T08:54:09.7898810Z x 4294967295(0xffffffff) 2025-12-04T08:54:09.7898944Z y 4294967295(0xffffffff) 2025-12-04T08:54:09.7899076Z z 4294967295(0xffffffff) 2025-12-04T08:54:09.7899253Z FBarrier Max Size: 32 2025-12-04T08:54:09.7899392Z ISA 2 2025-12-04T08:54:09.7899533Z Name: 
amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T08:54:09.7899709Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T08:54:09.7899870Z Profiles: HSA_PROFILE_BASE 2025-12-04T08:54:09.7900026Z Default Rounding Mode: NEAR 2025-12-04T08:54:09.7900190Z Default Rounding Mode: NEAR 2025-12-04T08:54:09.7900342Z Fast f16: TRUE 2025-12-04T08:54:09.7900491Z Workgroup Max Size: 1024(0x400) 2025-12-04T08:54:09.7900637Z Workgroup Max Size per Dimension: 2025-12-04T08:54:09.7900761Z x 1024(0x400) 2025-12-04T08:54:09.7900898Z y 1024(0x400) 2025-12-04T08:54:09.7901030Z z 1024(0x400) 2025-12-04T08:54:09.7901167Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T08:54:09.7901309Z Grid Max Size per Dimension: 2025-12-04T08:54:09.7901432Z x 4294967295(0xffffffff) 2025-12-04T08:54:09.7901560Z y 4294967295(0xffffffff) 2025-12-04T08:54:09.7901692Z z 4294967295(0xffffffff) 2025-12-04T08:54:09.7901838Z FBarrier Max Size: 32 2025-12-04T08:54:09.7902024Z *** Done *** 2025-12-04T08:54:09.7913074Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T08:54:09.7913266Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T08:54:09.7913543Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-12-04T08:54:09.7913807Z if [[ $ngpu -eq 0 ]]; then 2025-12-04T08:54:09.7913955Z  echo "Error: Failed to detect any GPUs on the runner" 2025-12-04T08:54:09.7914098Z  echo "$msg" 2025-12-04T08:54:09.7914206Z  exit 1 2025-12-04T08:54:09.7914296Z fi 2025-12-04T08:54:09.7917610Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.7917755Z env: 2025-12-04T08:54:09.7929096Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.7929215Z ##[endgroup] 2025-12-04T08:54:09.8617416Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-12-04T08:54:09.8617588Z with: 2025-12-04T08:54:09.8617684Z diskspace-cutoff: 70 2025-12-04T08:54:09.8617783Z env: 2025-12-04T08:54:09.8617880Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.8617989Z ##[endgroup] 2025-12-04T08:54:09.8636472Z ##[group]Run set -ex 2025-12-04T08:54:09.8636604Z set -ex 2025-12-04T08:54:09.8636789Z diskspace_cutoff=70 2025-12-04T08:54:09.8636933Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-12-04T08:54:09.8637093Z if [ ! -d "$docker_root_dir" ]; then 2025-12-04T08:54:09.8637292Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-12-04T08:54:09.8637477Z  exit 0 2025-12-04T08:54:09.8637572Z fi 2025-12-04T08:54:09.8637735Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T08:54:09.8638056Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner.
Include a link to the runner logs so the runner can be identified" 2025-12-04T08:54:09.8638338Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-12-04T08:54:09.8638487Z  docker system prune -af 2025-12-04T08:54:09.8638678Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T08:54:09.8638942Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-12-04T08:54:09.8639108Z  diskspace_cutoff_int=$((diskspace_cutoff + 0)) 2025-12-04T08:54:09.8639260Z  difference=$((100 - diskspace_cutoff_int)) 2025-12-04T08:54:09.8639478Z  echo "Error: Available diskspace is less than $difference percent. Not enough diskspace." 2025-12-04T08:54:09.8639677Z  echo "$msg" 2025-12-04T08:54:09.8639785Z  exit 1 2025-12-04T08:54:09.8639884Z  else 2025-12-04T08:54:09.8640000Z  difference=$((diskspace - diskspace_new)) 2025-12-04T08:54:09.8640151Z  echo "Diskspace saved: $difference percent" 2025-12-04T08:54:09.8640280Z  fi 2025-12-04T08:54:09.8640366Z fi 2025-12-04T08:54:09.8643279Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.8643428Z env: 2025-12-04T08:54:09.8643519Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.8643622Z ##[endgroup] 2025-12-04T08:54:09.8657509Z + diskspace_cutoff=70 2025-12-04T08:54:09.8661521Z ++ docker info -f '{{.DockerRootDir}}' 2025-12-04T08:54:09.9007700Z + docker_root_dir=/home/runner/docker-data 2025-12-04T08:54:09.9008081Z + '[' '!' -d /home/runner/docker-data ']' 2025-12-04T08:54:09.9013293Z ++ df -H --output=pcent /home/runner/docker-data 2025-12-04T08:54:09.9014191Z ++ sed -n 2p 2025-12-04T08:54:09.9015578Z ++ sed s/%// 2025-12-04T08:54:09.9015677Z ++ sed 's/ //' 2025-12-04T08:54:09.9029419Z + diskspace=' 4' 2025-12-04T08:54:09.9029849Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-12-04T08:54:09.9030111Z + [[ 4 -ge 70 ]] 2025-12-04T08:54:09.9053809Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T08:54:09.9054051Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T08:54:09.9054226Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-12-04T08:54:09.9054370Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-12-04T08:54:09.9054547Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-12-04T08:54:09.9054706Z  2025-12-04T08:54:09.9054830Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-12-04T08:54:09.9054991Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T08:54:09.9055124Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T08:54:09.9055305Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T08:54:09.9055470Z  2025-12-04T08:54:09.9055571Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-12-04T08:54:09.9055709Z rm -rf "${RUNNER_DOCS_DIR}" 2025-12-04T08:54:09.9055832Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-12-04T08:54:09.9055987Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T08:54:09.9060378Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.9060522Z env: 2025-12-04T08:54:09.9060620Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.9060725Z ##[endgroup] 2025-12-04T08:54:09.9128015Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:09.9128227Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:09.9128407Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:09.9131378Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.9131517Z env: 2025-12-04T08:54:09.9131609Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.9131740Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:09.9131964Z RUNNER_TEST_RESULTS_DIR: 
/home/runner/_work/_temp/test-results 2025-12-04T08:54:09.9132126Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:09.9132330Z ##[endgroup] 2025-12-04T08:54:09.9173102Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T08:54:09.9173391Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T08:54:09.9173593Z # Add render group for container creation. 2025-12-04T08:54:09.9173766Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-12-04T08:54:09.9173972Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-12-04T08:54:09.9174175Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-12-04T08:54:09.9174338Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-12-04T08:54:09.9174480Z else 2025-12-04T08:54:09.9174587Z  DEVICE_FLAG="--device /dev/dri" 2025-12-04T08:54:09.9174705Z fi 2025-12-04T08:54:09.9174887Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-12-04T08:54:09.9175172Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-12-04T08:54:09.9175421Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-12-04T08:54:09.9175684Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary.
2025-12-04T08:54:09.9176121Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-12-04T08:54:09.9179037Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:09.9179178Z env: 2025-12-04T08:54:09.9179274Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.9179409Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:09.9179588Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:09.9179757Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:09.9179881Z ##[endgroup] 2025-12-04T08:54:09.9235477Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-12-04T08:54:09.9235676Z with: 2025-12-04T08:54:09.9235826Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-12-04T08:54:09.9235994Z aws-region: us-east-1 2025-12-04T08:54:09.9236107Z role-duration-seconds: 18000 2025-12-04T08:54:09.9236227Z audience: sts.amazonaws.com 2025-12-04T08:54:09.9236333Z env: 2025-12-04T08:54:09.9236422Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:09.9236549Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:09.9236721Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:09.9236880Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:09.9237353Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:09.9237713Z ##[endgroup] 2025-12-04T08:54:10.2463290Z Assuming role with OIDC 2025-12-04T08:54:10.5863768Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-12-04T08:54:10.6822052Z ##[group]Run 
aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-12-04T08:54:10.6822287Z with: 2025-12-04T08:54:10.6822395Z mask-password: true 2025-12-04T08:54:10.6822515Z registry-type: private 2025-12-04T08:54:10.6822640Z skip-logout: false 2025-12-04T08:54:10.6822753Z env: 2025-12-04T08:54:10.6822842Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:10.6822982Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:10.6823158Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:10.6823325Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:10.6823862Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:10.6824233Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:10.6824345Z AWS_REGION: us-east-1 2025-12-04T08:54:10.6824792Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:10.6824941Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:10.6827171Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:10.6827277Z ##[endgroup] 2025-12-04T08:54:11.0966871Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7385056Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:11.7385383Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:11.7385642Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:11.7385917Z env | grep '^RUNNER' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T08:54:11.7391374Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:11.7391584Z env: 2025-12-04T08:54:11.7391721Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:11.7391976Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:11.7392225Z RUNNER_TEST_RESULTS_DIR: 
/home/runner/_work/_temp/test-results 2025-12-04T08:54:11.7392449Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:11.7392964Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:11.7393473Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:11.7393634Z AWS_REGION: us-east-1 2025-12-04T08:54:11.7393919Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:11.7394173Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:11.7396617Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:11.7396737Z ##[endgroup] 2025-12-04T08:54:11.7556683Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T08:54:11.7556876Z with: 2025-12-04T08:54:11.7557167Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7557477Z use-custom-docker-registry: true 2025-12-04T08:54:11.7557612Z docker-build-dir: .ci/docker 2025-12-04T08:54:11.7557735Z docker-build-script: ./build.sh 2025-12-04T08:54:11.7557861Z working-directory: . 
2025-12-04T08:54:11.7558009Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7558169Z force-push: false 2025-12-04T08:54:11.7558272Z env: 2025-12-04T08:54:11.7558367Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:11.7558509Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:11.7558705Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:11.7558886Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:11.7559273Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:11.7559646Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:11.7559764Z AWS_REGION: us-east-1 2025-12-04T08:54:11.7560050Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:11.7560205Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:11.7562463Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:11.7562570Z ##[endgroup] 2025-12-04T08:54:11.7571355Z ##[group]Run set -ex 2025-12-04T08:54:11.7571505Z set -ex 2025-12-04T08:54:11.7571602Z  2025-12-04T08:54:11.7571761Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T08:54:11.7572215Z # gracefully return the docker image name as it is. 
Pulling docker image in Linux 2025-12-04T08:54:11.7572431Z # job could then download the pre-built image as usual 2025-12-04T08:54:11.7572687Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T08:54:11.7572925Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7573058Z else 2025-12-04T08:54:11.7573172Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7573348Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7573506Z  2025-12-04T08:54:11.7573715Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 2025-12-04T08:54:11.7573946Z  exit 0 2025-12-04T08:54:11.7574042Z fi 2025-12-04T08:54:11.7574142Z  2025-12-04T08:54:11.7574279Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T08:54:11.7574505Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T08:54:11.7574704Z  # use it as it is, but first let's extract the tag 2025-12-04T08:54:11.7574890Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T08:54:11.7575084Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7575269Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7575423Z else 2025-12-04T08:54:11.7575535Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T08:54:11.7575690Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T08:54:11.7575838Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T08:54:11.7575963Z  fi 2025-12-04T08:54:11.7576230Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T08:54:11.7576452Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7576683Z  echo 
"docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7576933Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7577089Z fi 2025-12-04T08:54:11.7581463Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:11.7581610Z env: 2025-12-04T08:54:11.7581702Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:11.7581837Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:11.7582068Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:11.7582233Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:11.7582615Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:11.7582986Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:11.7583099Z AWS_REGION: us-east-1 2025-12-04T08:54:11.7583235Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:11.7583386Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:11.7585588Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:11.7585694Z REPO_NAME: pytorch 2025-12-04T08:54:11.7585975Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7586270Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T08:54:11.7586392Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T08:54:11.7586544Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7586703Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T08:54:11.7586873Z CUSTOM_TAG_PREFIX: 2025-12-04T08:54:11.7586981Z ##[endgroup] 2025-12-04T08:54:11.7606852Z + [[ -d .ci/docker ]] 2025-12-04T08:54:11.7607211Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T08:54:11.7607428Z + [[ true == \t\r\u\e ]] 2025-12-04T08:54:11.7607616Z + echo skip=false 
2025-12-04T08:54:11.7608295Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a == *\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T08:54:11.7613071Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7613566Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T08:54:11.7621725Z + DOCKER_TAG=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7622213Z + echo docker-tag=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7622891Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7654109Z ##[group]Run set +e 2025-12-04T08:54:11.7654284Z set +e 2025-12-04T08:54:11.7654409Z set -x 2025-12-04T08:54:11.7654521Z  2025-12-04T08:54:11.7654636Z login() { 2025-12-04T08:54:11.7654860Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T08:54:11.7655088Z } 2025-12-04T08:54:11.7655192Z  2025-12-04T08:54:11.7655302Z retry () { 2025-12-04T08:54:11.7655441Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T08:54:11.7655592Z } 2025-12-04T08:54:11.7655695Z  2025-12-04T08:54:11.7655812Z retry login "${DOCKER_REGISTRY}" 2025-12-04T08:54:11.7655954Z  2025-12-04T08:54:11.7656063Z START_TIME=$(date +%s) 2025-12-04T08:54:11.7656226Z # Wait up to 120 minutes 2025-12-04T08:54:11.7656561Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T08:54:11.7656775Z  # Check if image already exists, if it does then skip building it 2025-12-04T08:54:11.7656991Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T08:54:11.7657152Z  exit 0 2025-12-04T08:54:11.7657266Z  
fi 2025-12-04T08:54:11.7657373Z  2025-12-04T08:54:11.7657549Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T08:54:11.7657824Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T08:54:11.7658104Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T08:54:11.7658329Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T08:54:11.7658514Z  # It's a Docker build job, let's build the image 2025-12-04T08:54:11.7658677Z  break 2025-12-04T08:54:11.7658797Z  else 2025-12-04T08:54:11.7658955Z  # It's a regular build job, wait for the image to become available 2025-12-04T08:54:11.7659141Z  sleep 300 2025-12-04T08:54:11.7659263Z  fi 2025-12-04T08:54:11.7659379Z done 2025-12-04T08:54:11.7659484Z  2025-12-04T08:54:11.7659646Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T08:54:11.7659890Z # be empty. The default action would be to continue rebuild the image 2025-12-04T08:54:11.7660115Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T08:54:11.7660318Z  # if we're on the base branch then use the parent commit 2025-12-04T08:54:11.7660497Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T08:54:11.7660646Z else 2025-12-04T08:54:11.7660799Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T08:54:11.7661141Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T08:54:11.7661301Z fi 2025-12-04T08:54:11.7661408Z  2025-12-04T08:54:11.7661530Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T08:54:11.7661696Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7662030Z  2025-12-04T08:54:11.7662233Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T08:54:11.7662468Z  exit 0 2025-12-04T08:54:11.7662577Z fi 2025-12-04T08:54:11.7662674Z  2025-12-04T08:54:11.7662821Z if ! 
git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T08:54:11.7663110Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T08:54:11.7663348Z  exit 1 2025-12-04T08:54:11.7663449Z fi 2025-12-04T08:54:11.7663544Z  2025-12-04T08:54:11.7663700Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T08:54:11.7663950Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T08:54:11.7664173Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T08:54:11.7664430Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T08:54:11.7664709Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T08:54:11.7664884Z fi 2025-12-04T08:54:11.7664978Z  2025-12-04T08:54:11.7665091Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T08:54:11.7669354Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:11.7669500Z env: 2025-12-04T08:54:11.7669599Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:11.7669738Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:11.7669960Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:11.7670124Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:11.7670505Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:11.7670873Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:11.7670986Z AWS_REGION: us-east-1 2025-12-04T08:54:11.7671185Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:11.7671338Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:11.7673681Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:11.7673790Z 
DOCKER_BUILD_DIR: .ci/docker 2025-12-04T08:54:11.7673930Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T08:54:11.7674248Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7674609Z DOCKER_TAG: pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:11.7674834Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7674983Z DOCKER_PUSH: 2025-12-04T08:54:11.7675079Z ##[endgroup] 2025-12-04T08:54:11.7692686Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7692866Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7695850Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:11.7696050Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:11.7696448Z /home/runner/_work/_temp/9aa50f40-9f7d-468f-b112-b8e6f273ac05.sh: line 5: aws: command not found 2025-12-04T08:54:11.7787517Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:11.7797305Z + sleep 1 2025-12-04T08:54:12.7806484Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:12.7811335Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:12.7811718Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:12.7812227Z /home/runner/_work/_temp/9aa50f40-9f7d-468f-b112-b8e6f273ac05.sh: line 5: aws: command not found 2025-12-04T08:54:12.7906162Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:12.7916159Z + sleep 2 2025-12-04T08:54:14.7926025Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:14.7931440Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:14.7932424Z /home/runner/_work/_temp/9aa50f40-9f7d-468f-b112-b8e6f273ac05.sh: line 5: aws: 
command not found 2025-12-04T08:54:14.7933154Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:14.8009333Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:14.8021210Z ++ date +%s 2025-12-04T08:54:14.8029221Z + START_TIME=1764838454 2025-12-04T08:54:14.8032118Z ++ date +%s 2025-12-04T08:54:14.8040524Z + [[ 1764831254 -lt 1764838454 ]] 2025-12-04T08:54:14.8041223Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:16.1659462Z { 2025-12-04T08:54:16.1659765Z "schemaVersion": 2, 2025-12-04T08:54:16.1660218Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T08:54:16.1660625Z "config": { 2025-12-04T08:54:16.1660930Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T08:54:16.1661292Z "size": 30520, 2025-12-04T08:54:16.1661658Z "digest": "sha256:45252333063339f104d56e41f20304e9511ab21c7768e8d156b95ddf24a9dbe5" 2025-12-04T08:54:16.1662155Z }, 2025-12-04T08:54:16.1662340Z "layers": [ 2025-12-04T08:54:16.1662532Z { 2025-12-04T08:54:16.1662885Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1663216Z "size": 30447951, 2025-12-04T08:54:16.1664340Z "digest": "sha256:63e5bc7682b85ae57a1221210f64d62e7a90b0a30f19af4ca734b8242ae49d63" 2025-12-04T08:54:16.1664720Z }, 2025-12-04T08:54:16.1664883Z { 2025-12-04T08:54:16.1665159Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1665490Z "size": 1554, 2025-12-04T08:54:16.1665940Z "digest": "sha256:835841cca3b7e1464290cdb78e48773e03583413fbed852c3cc5165a392ea44d" 2025-12-04T08:54:16.1666309Z }, 2025-12-04T08:54:16.1666467Z { 2025-12-04T08:54:16.1666731Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1667071Z "size": 313275691, 2025-12-04T08:54:16.1667422Z "digest": 
"sha256:aac69780afc8611a5f94a235792d39ae055249c8319ef43b78675998a9b2f825" 2025-12-04T08:54:16.1667795Z }, 2025-12-04T08:54:16.1667953Z { 2025-12-04T08:54:16.1668226Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1668562Z "size": 704, 2025-12-04T08:54:16.1668911Z "digest": "sha256:029495b23122c840ca0e52d487afa8d2c4dbf1991cd7f204ec3e434dcf947bf4" 2025-12-04T08:54:16.1669274Z }, 2025-12-04T08:54:16.1669436Z { 2025-12-04T08:54:16.1669696Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1670020Z "size": 1218, 2025-12-04T08:54:16.1670357Z "digest": "sha256:d0fb85b008332051a3f7c052721ef68bde404b46c23fa43ad040373bd367826c" 2025-12-04T08:54:16.1670717Z }, 2025-12-04T08:54:16.1670877Z { 2025-12-04T08:54:16.1671145Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1671473Z "size": 484, 2025-12-04T08:54:16.1671806Z "digest": "sha256:59b63930883363c7d2aaab27cc61555d9f3e119dc18247a8624c98ebdaa354a5" 2025-12-04T08:54:16.1672229Z }, 2025-12-04T08:54:16.1672392Z { 2025-12-04T08:54:16.1672664Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1673149Z "size": 110363202, 2025-12-04T08:54:16.1673511Z "digest": "sha256:dc112c89d57aa1e85082e40a56e5bc743d64f834ae2f98afe91f60c248354d38" 2025-12-04T08:54:16.1673886Z }, 2025-12-04T08:54:16.1674048Z { 2025-12-04T08:54:16.1674265Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1674507Z "size": 4436, 2025-12-04T08:54:16.1674741Z "digest": "sha256:522eab2402e5001810155ef7eb56940b7c01a4fef62ac588886981c3b8ee8e1e" 2025-12-04T08:54:16.1675003Z }, 2025-12-04T08:54:16.1675123Z { 2025-12-04T08:54:16.1675320Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1675548Z "size": 1755, 2025-12-04T08:54:16.1675790Z "digest": "sha256:2b5a11b41761d8ea3b829e4772e4064cb6c4e4989126af324d0057661e4493a1" 2025-12-04T08:54:16.1676048Z }, 
2025-12-04T08:54:16.1676164Z { 2025-12-04T08:54:16.1676354Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1676582Z "size": 724, 2025-12-04T08:54:16.1676825Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T08:54:16.1677087Z }, 2025-12-04T08:54:16.1677214Z { 2025-12-04T08:54:16.1677402Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1677644Z "size": 3185588166, 2025-12-04T08:54:16.1677897Z "digest": "sha256:73e33534e9eb94cf29418d65944168962b65fe21f55e9b8bad18c76e9b3a37b8" 2025-12-04T08:54:16.1678157Z }, 2025-12-04T08:54:16.1678280Z { 2025-12-04T08:54:16.1678470Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1678706Z "size": 396, 2025-12-04T08:54:16.1678948Z "digest": "sha256:5bfdaeb5578d6ffcd7db29c48303cbceb13c591210feaa216a8daa7a6d445b4b" 2025-12-04T08:54:16.1679217Z }, 2025-12-04T08:54:16.1679333Z { 2025-12-04T08:54:16.1679530Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1679763Z "size": 236863, 2025-12-04T08:54:16.1680012Z "digest": "sha256:c07d27e4d3a5ba4ad5325bb785b2e4f058fe5e10ec1aeeb413a1e152b073f203" 2025-12-04T08:54:16.1680292Z }, 2025-12-04T08:54:16.1680486Z { 2025-12-04T08:54:16.1680671Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1680909Z "size": 787, 2025-12-04T08:54:16.1681148Z "digest": "sha256:b21856d1bf420da6fa8ec7331b82ab355d4f4178644e7d3a3d3d0fbc3610109a" 2025-12-04T08:54:16.1681419Z }, 2025-12-04T08:54:16.1681538Z { 2025-12-04T08:54:16.1681730Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1682265Z "size": 106, 2025-12-04T08:54:16.1682604Z "digest": "sha256:cb19d84867e4063f55db9459c28c50a2abc37c06d3c1ca82ba95fa8427cc438a" 2025-12-04T08:54:16.1682874Z }, 2025-12-04T08:54:16.1682990Z { 2025-12-04T08:54:16.1683179Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1683413Z "size": 1496, 2025-12-04T08:54:16.1683641Z "digest": "sha256:8165374f8dccf88a7791a5d31afbe29e4d4542b4f1cf1904945e07f9af6bf8ba" 2025-12-04T08:54:16.1683916Z }, 2025-12-04T08:54:16.1684034Z { 2025-12-04T08:54:16.1684215Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1684433Z "size": 458789560, 2025-12-04T08:54:16.1684656Z "digest": "sha256:1aecc77354ceba59ec6f0d37a558f2dbb6d5c0854553ee8505ac8707b422da6d" 2025-12-04T08:54:16.1684921Z }, 2025-12-04T08:54:16.1685009Z { 2025-12-04T08:54:16.1685147Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1685321Z "size": 164, 2025-12-04T08:54:16.1685500Z "digest": "sha256:465d3fd643aa2ea0ad07335cda66f12f1d7e5e800c4e9385ec466bc8a1ceabda" 2025-12-04T08:54:16.1685699Z }, 2025-12-04T08:54:16.1685788Z { 2025-12-04T08:54:16.1685929Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1686101Z "size": 104, 2025-12-04T08:54:16.1686279Z "digest": "sha256:6c503e779d6f41ca7f51309875df2b725c171926aece7009c4b8a64d1ba3f58e" 2025-12-04T08:54:16.1686533Z }, 2025-12-04T08:54:16.1686620Z { 2025-12-04T08:54:16.1686765Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1686939Z "size": 724, 2025-12-04T08:54:16.1687115Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T08:54:16.1687313Z }, 2025-12-04T08:54:16.1687400Z { 2025-12-04T08:54:16.1687542Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1687716Z "size": 196, 2025-12-04T08:54:16.1687890Z "digest": "sha256:f7e9a021f0ee3d11a50dcb96378af8103a21f6c3c142f54529207648f3ed00b2" 2025-12-04T08:54:16.1688087Z }, 2025-12-04T08:54:16.1688175Z { 2025-12-04T08:54:16.1688317Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1688492Z "size": 2583, 
2025-12-04T08:54:16.1688673Z "digest": "sha256:8e023b349080fb11ee55491bc9b842b30e9e3a90246d05b303a73dc62038caf2" 2025-12-04T08:54:16.1688868Z }, 2025-12-04T08:54:16.1688963Z { 2025-12-04T08:54:16.1689105Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1689283Z "size": 7577171420, 2025-12-04T08:54:16.1689471Z "digest": "sha256:8188df80e595a3dbcf84623c6a58a655269898cbb60029435f136d7f9d34ccaa" 2025-12-04T08:54:16.1689664Z }, 2025-12-04T08:54:16.1689751Z { 2025-12-04T08:54:16.1689893Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1690064Z "size": 135, 2025-12-04T08:54:16.1690246Z "digest": "sha256:3c2c2f8c74bfa16c4bf9a832c97bbb1d55205b2b4a2cead02cf74301ca1001fb" 2025-12-04T08:54:16.1690445Z }, 2025-12-04T08:54:16.1690530Z { 2025-12-04T08:54:16.1690670Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1690843Z "size": 104, 2025-12-04T08:54:16.1691026Z "digest": "sha256:2aa7784fbe3300f8bbfb6bb51cff3b01fd091e829c2bc7ab9e25261a0dd9b3bd" 2025-12-04T08:54:16.1691228Z }, 2025-12-04T08:54:16.1691316Z { 2025-12-04T08:54:16.1691462Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1691635Z "size": 612, 2025-12-04T08:54:16.1691927Z "digest": "sha256:2b3b5215d3ebe8789f0444457bfd5a6e218289b64aa07653ac3d03ddda5e6708" 2025-12-04T08:54:16.1692125Z }, 2025-12-04T08:54:16.1692213Z { 2025-12-04T08:54:16.1692356Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1692532Z "size": 838191945, 2025-12-04T08:54:16.1692721Z "digest": "sha256:99b1f1ea3e857834cebd01763d90fbd700aeb9c2d2ef23eda2cfff5652c9708b" 2025-12-04T08:54:16.1692921Z }, 2025-12-04T08:54:16.1693009Z { 2025-12-04T08:54:16.1693151Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1693324Z "size": 111, 2025-12-04T08:54:16.1693503Z "digest": 
"sha256:18d6daba0a5768a37ad106b57974f6b7efd35c43a87c246bcd3f43fea88f2d2b" 2025-12-04T08:54:16.1693701Z }, 2025-12-04T08:54:16.1693789Z { 2025-12-04T08:54:16.1693934Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1694111Z "size": 1555, 2025-12-04T08:54:16.1694296Z "digest": "sha256:5277f2a503ebd17ba9d9b86cc9bac86265504adeb449c0647616ddaacd3cbc41" 2025-12-04T08:54:16.1694494Z }, 2025-12-04T08:54:16.1694584Z { 2025-12-04T08:54:16.1694726Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1694885Z "size": 107, 2025-12-04T08:54:16.1695039Z "digest": "sha256:3198a9717aace920fd5de085319adf75091af05fc4318ce4b16a8a5b0e8d449e" 2025-12-04T08:54:16.1695212Z }, 2025-12-04T08:54:16.1695289Z { 2025-12-04T08:54:16.1695414Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1695565Z "size": 166, 2025-12-04T08:54:16.1695713Z "digest": "sha256:99a4918e5808277879449e97ccd7190db6b9aa2d742b57a3b831ce0198522bdd" 2025-12-04T08:54:16.1695882Z }, 2025-12-04T08:54:16.1695960Z { 2025-12-04T08:54:16.1696082Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1696275Z "size": 3526081, 2025-12-04T08:54:16.1696437Z "digest": "sha256:15bb11dfc6acc3537d527d6771c8e711e5605e99f82ec41e805d4600b8a97516" 2025-12-04T08:54:16.1696607Z }, 2025-12-04T08:54:16.1696686Z { 2025-12-04T08:54:16.1696810Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1696961Z "size": 107, 2025-12-04T08:54:16.1697117Z "digest": "sha256:bd87c8766e90e33db17514558ac591cc3f4149afd7abeaef4dd5770bbfa14210" 2025-12-04T08:54:16.1697288Z }, 2025-12-04T08:54:16.1697364Z { 2025-12-04T08:54:16.1697489Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1697641Z "size": 829, 2025-12-04T08:54:16.1697794Z "digest": "sha256:1969e15d0c13874ea5883ed829235a19ef6dc21c8aa6172032b78a8ffa6ff262" 2025-12-04T08:54:16.1697965Z }, 
2025-12-04T08:54:16.1698042Z { 2025-12-04T08:54:16.1698168Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1698322Z "size": 26973054, 2025-12-04T08:54:16.1698491Z "digest": "sha256:24a03847d382b73c11969f8f73916a6bedf5ccea12f6f4290b3880f29ceda32a" 2025-12-04T08:54:16.1698664Z }, 2025-12-04T08:54:16.1698742Z { 2025-12-04T08:54:16.1698868Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1699019Z "size": 104, 2025-12-04T08:54:16.1699176Z "digest": "sha256:816e2e34e01839a35d624dbf4bd9ac9bea4c975104af47a0e6b6b6dee6c6f98d" 2025-12-04T08:54:16.1699348Z }, 2025-12-04T08:54:16.1699425Z { 2025-12-04T08:54:16.1699548Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1699698Z "size": 424, 2025-12-04T08:54:16.1699853Z "digest": "sha256:b168858b85373f8ddca549d79267a06de4fa945d04bf791c55c9ddc93957fa3c" 2025-12-04T08:54:16.1700024Z }, 2025-12-04T08:54:16.1700100Z { 2025-12-04T08:54:16.1700220Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1700373Z "size": 19309386, 2025-12-04T08:54:16.1700536Z "digest": "sha256:6b8d5ff02e267e38322afbb8a58ed63ce9d75b10e9e73255e6affcbc6b6539bf" 2025-12-04T08:54:16.1700719Z }, 2025-12-04T08:54:16.1700838Z { 2025-12-04T08:54:16.1700961Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1701110Z "size": 826, 2025-12-04T08:54:16.1701264Z "digest": "sha256:4e3b10a5dd6aed29f238d604925e2a4f873141c1087c8dd4fdde5c61e7560893" 2025-12-04T08:54:16.1701435Z }, 2025-12-04T08:54:16.1701511Z { 2025-12-04T08:54:16.1701634Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1701784Z "size": 724, 2025-12-04T08:54:16.1701975Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T08:54:16.1702142Z }, 2025-12-04T08:54:16.1702219Z { 2025-12-04T08:54:16.1702343Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1702494Z "size": 149, 2025-12-04T08:54:16.1702650Z "digest": "sha256:3092fab73b59190b9facfc49bf18f58612172bc2fd68dfa339a1118632616939" 2025-12-04T08:54:16.1702826Z }, 2025-12-04T08:54:16.1702903Z { 2025-12-04T08:54:16.1703030Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1703183Z "size": 136, 2025-12-04T08:54:16.1703342Z "digest": "sha256:20020dd28a15ba092fcbfe906ee39cdddfcc9d0b7eb42fdd6f4c08a984fa9c00" 2025-12-04T08:54:16.1703516Z }, 2025-12-04T08:54:16.1703594Z { 2025-12-04T08:54:16.1703716Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1703867Z "size": 140, 2025-12-04T08:54:16.1704023Z "digest": "sha256:ae5280ce969dcff08c091e9a5f7641f13561b2b0ee44d78b7c3f81d8fe8e6d32" 2025-12-04T08:54:16.1704195Z }, 2025-12-04T08:54:16.1704271Z { 2025-12-04T08:54:16.1704398Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1704549Z "size": 32, 2025-12-04T08:54:16.1704708Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T08:54:16.1704962Z }, 2025-12-04T08:54:16.1705035Z { 2025-12-04T08:54:16.1705162Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1705314Z "size": 222, 2025-12-04T08:54:16.1705471Z "digest": "sha256:fe17d9eb0fd26d3af4c724bf570d833978b131cedb7dc17a800aa388a246b3cd" 2025-12-04T08:54:16.1705643Z }, 2025-12-04T08:54:16.1705719Z { 2025-12-04T08:54:16.1705843Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1705996Z "size": 346, 2025-12-04T08:54:16.1706147Z "digest": "sha256:a51e0dab2d596e6563483f27c12660007160847d177ba4c31812a8f44ada5754" 2025-12-04T08:54:16.1706314Z }, 2025-12-04T08:54:16.1706389Z { 2025-12-04T08:54:16.1706513Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1706665Z "size": 88300, 
2025-12-04T08:54:16.1706825Z "digest": "sha256:6eb176cefd72d37ecbcdf074289a8f1de732d8816cc695ece7e4709d098094d6" 2025-12-04T08:54:16.1706998Z }, 2025-12-04T08:54:16.1707081Z { 2025-12-04T08:54:16.1707204Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1707356Z "size": 106, 2025-12-04T08:54:16.1707510Z "digest": "sha256:e7b8cf2e8d5a4c56db9726ce62c1176032408b3b1c25a000592361cb4245e2b5" 2025-12-04T08:54:16.1707680Z }, 2025-12-04T08:54:16.1707755Z { 2025-12-04T08:54:16.1707880Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1708030Z "size": 1671, 2025-12-04T08:54:16.1708193Z "digest": "sha256:ef3a5060abce88884bc8bd815aa41c46427f34eeb132fe0ddd85a3f86e6dc83d" 2025-12-04T08:54:16.1708366Z }, 2025-12-04T08:54:16.1708443Z { 2025-12-04T08:54:16.1708566Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1708715Z "size": 724, 2025-12-04T08:54:16.1708867Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T08:54:16.1709034Z }, 2025-12-04T08:54:16.1709110Z { 2025-12-04T08:54:16.1709234Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1709389Z "size": 138, 2025-12-04T08:54:16.1720539Z "digest": "sha256:a6f4ec14b42b8f0a83d20aa6a985ddb6a1bf64e0ed3d44afd3484b87d4ed5ad3" 2025-12-04T08:54:16.1720743Z }, 2025-12-04T08:54:16.1720824Z { 2025-12-04T08:54:16.1720955Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1721111Z "size": 119, 2025-12-04T08:54:16.1721274Z "digest": "sha256:7e5a0c956cfbd6f8074fbfd3b1d416e6635d632835ec00c8dd4c015a21da19b4" 2025-12-04T08:54:16.1721449Z }, 2025-12-04T08:54:16.1721527Z { 2025-12-04T08:54:16.1721650Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1721807Z "size": 6238423049, 2025-12-04T08:54:16.1722009Z "digest": "sha256:b4f78730cfe76ce091b78b2e2e3d52be03f1097b3e4c3de5bd79f8d13a853132" 
2025-12-04T08:54:16.1722182Z }, 2025-12-04T08:54:16.1722258Z { 2025-12-04T08:54:16.1722383Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1722537Z "size": 174, 2025-12-04T08:54:16.1722690Z "digest": "sha256:081028f24389b112683689fd362e8c0d6f358082710e72feab91cea6383feb4d" 2025-12-04T08:54:16.1722859Z }, 2025-12-04T08:54:16.1722935Z { 2025-12-04T08:54:16.1723064Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1723214Z "size": 1896, 2025-12-04T08:54:16.1723378Z "digest": "sha256:a534dcf4b9a9e5fabed742c8a8fc43c9cfe7346ea88ab3c177c3b14fd3afe00a" 2025-12-04T08:54:16.1723558Z }, 2025-12-04T08:54:16.1723635Z { 2025-12-04T08:54:16.1723759Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1723911Z "size": 197577597, 2025-12-04T08:54:16.1724072Z "digest": "sha256:2e77500302cc13224427e1d74e471bd79d5109ba6a5099a83df1d10b786f71ba" 2025-12-04T08:54:16.1724238Z }, 2025-12-04T08:54:16.1724314Z { 2025-12-04T08:54:16.1724437Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1724643Z "size": 304, 2025-12-04T08:54:16.1724806Z "digest": "sha256:bc08246bb4ba18c3ec5bc69e16b6b4e929c5bd0f3fae10eeb0b1a622a63d6fa2" 2025-12-04T08:54:16.1724980Z }, 2025-12-04T08:54:16.1725057Z { 2025-12-04T08:54:16.1725181Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1725331Z "size": 32, 2025-12-04T08:54:16.1725492Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T08:54:16.1725664Z }, 2025-12-04T08:54:16.1725741Z { 2025-12-04T08:54:16.1725861Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1726010Z "size": 106, 2025-12-04T08:54:16.1726167Z "digest": "sha256:ff0c473ca120ebdcaa2ba10b3274e82032edd5196019e76d4e7584553704ae81" 2025-12-04T08:54:16.1726339Z }, 2025-12-04T08:54:16.1726415Z { 2025-12-04T08:54:16.1726539Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T08:54:16.1726692Z "size": 54145662, 2025-12-04T08:54:16.1726864Z "digest": "sha256:6bbc14b250efb3cdaad12c91573c6bb9129ad3e3432f0ed1a7eaebc9958d162f" 2025-12-04T08:54:16.1727039Z } 2025-12-04T08:54:16.1727119Z ] 2025-12-04T08:54:16.1727199Z } 2025-12-04T08:54:16.1727286Z + exit 0 2025-12-04T08:54:16.1743215Z ##[group]Run set -eux 2025-12-04T08:54:16.1743332Z set -eux 2025-12-04T08:54:16.1743492Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T08:54:16.1743904Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T08:54:16.1748397Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:16.1748545Z env: 2025-12-04T08:54:16.1748639Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:16.1748775Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:16.1748949Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:16.1749116Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:16.1749555Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:16.1749926Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:16.1750041Z AWS_REGION: us-east-1 2025-12-04T08:54:16.1750230Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:16.1750381Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:16.1752657Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:16.1752763Z ##[endgroup] 2025-12-04T08:54:16.1778564Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T08:54:16.1778821Z + jq --raw-output .SecretString 2025-12-04T08:54:16.1779095Z 
/home/runner/_work/_temp/a6843953-5e83-4ae8-a94f-a33660c41c60.sh: line 3: aws: command not found 2025-12-04T08:54:16.1780812Z + jq -r .docker_hub_readonly_token 2025-12-04T08:54:16.1781992Z + docker login --username pytorchbot --password-stdin 2025-12-04T08:54:16.1881719Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:16.1889039Z + true 2025-12-04T08:54:16.1950957Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T08:54:16.1951145Z with: 2025-12-04T08:54:16.1951418Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:16.1951742Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:16.1951963Z env: 2025-12-04T08:54:16.1952060Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:16.1952203Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:16.1952383Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:16.1952551Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:16.1953076Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:16.1953469Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:16.1953589Z AWS_REGION: us-east-1 2025-12-04T08:54:16.1953778Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:16.1953932Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:16.1956162Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:16.1956270Z ##[endgroup] 2025-12-04T08:54:16.1962951Z ##[group]Run set -x 2025-12-04T08:54:16.1963067Z set -x 2025-12-04T08:54:16.1963162Z set +e 2025-12-04T08:54:16.1963253Z  2025-12-04T08:54:16.1963343Z login() { 2025-12-04T08:54:16.1963530Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 
2025-12-04T08:54:16.1963725Z } 2025-12-04T08:54:16.1963813Z  2025-12-04T08:54:16.1963910Z retry () { 2025-12-04T08:54:16.1964023Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T08:54:16.1964151Z } 2025-12-04T08:54:16.1964240Z  2025-12-04T08:54:16.1964343Z retry login "${DOCKER_REGISTRY}" 2025-12-04T08:54:16.1964464Z  2025-12-04T08:54:16.1964654Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T08:54:16.1964900Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T08:54:16.1965049Z  2025-12-04T08:54:16.1965132Z set -e 2025-12-04T08:54:16.1965269Z # ignore output since only exit code is used for conditional 2025-12-04T08:54:16.1965452Z # only pull docker image if it's not available locally 2025-12-04T08:54:16.1965655Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T08:54:16.1965841Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T08:54:16.1965968Z fi 2025-12-04T08:54:16.1970157Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T08:54:16.1970297Z env: 2025-12-04T08:54:16.1970389Z GIT_DEFAULT_BRANCH: main 2025-12-04T08:54:16.1970526Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T08:54:16.1970699Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T08:54:16.1970862Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T08:54:16.1971239Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T08:54:16.1971608Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T08:54:16.1971724Z AWS_REGION: us-east-1 2025-12-04T08:54:16.1971911Z AWS_ACCESS_KEY_ID: *** 2025-12-04T08:54:16.1972061Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T08:54:16.1974281Z AWS_SESSION_TOKEN: *** 2025-12-04T08:54:16.1974562Z DOCKER_IMAGE: 
308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:16.1974977Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:16.1975128Z ##[endgroup] 2025-12-04T08:54:16.1992642Z + set +e 2025-12-04T08:54:16.1992772Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:16.1992947Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:16.1995729Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:16.1995963Z /home/runner/_work/_temp/c3a0ea26-7c94-4de1-826d-10caf821cb52.sh: line 5: aws: command not found 2025-12-04T08:54:16.1997188Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:16.2080432Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:16.2092232Z + sleep 1 2025-12-04T08:54:17.2101241Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:17.2106146Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:17.2106767Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:17.2107531Z /home/runner/_work/_temp/c3a0ea26-7c94-4de1-826d-10caf821cb52.sh: line 5: aws: command not found 2025-12-04T08:54:17.2202640Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:17.2215512Z + sleep 2 2025-12-04T08:54:19.2229141Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:19.2233792Z + aws ecr get-login-password --region us-east-1 2025-12-04T08:54:19.2234330Z /home/runner/_work/_temp/c3a0ea26-7c94-4de1-826d-10caf821cb52.sh: line 5: aws: command not found 2025-12-04T08:54:19.2234946Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T08:54:19.2331732Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T08:54:19.2347597Z ++ docker manifest 
inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:19.2348779Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T08:54:20.5885603Z + IMAGE_SIZE=18171.470620155334 2025-12-04T08:54:20.5885896Z + echo 'Compressed size of image in MB: 18171.470620155334' 2025-12-04T08:54:20.5886132Z + set -e 2025-12-04T08:54:20.5886299Z Compressed size of image in MB: 18171.470620155334 2025-12-04T08:54:20.5886810Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:20.6004131Z + retry docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:20.6004771Z + docker pull 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T08:54:21.6520679Z pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a: Pulling from pytorch/ci-image 2025-12-04T08:54:21.6521453Z 63e5bc7682b8: Pulling fs layer 2025-12-04T08:54:21.6521831Z 835841cca3b7: Pulling fs layer 2025-12-04T08:54:21.6522347Z aac69780afc8: Pulling fs layer 2025-12-04T08:54:21.6522600Z 029495b23122: Pulling fs layer 2025-12-04T08:54:21.6522813Z d0fb85b00833: Pulling fs layer 2025-12-04T08:54:21.6523016Z 59b639308833: Pulling fs layer 2025-12-04T08:54:21.6523207Z dc112c89d57a: Pulling fs layer 2025-12-04T08:54:21.6523397Z 522eab2402e5: Pulling fs layer 2025-12-04T08:54:21.6523586Z 029495b23122: Waiting 2025-12-04T08:54:21.6523772Z 2b5a11b41761: Pulling fs layer 2025-12-04T08:54:21.6523962Z 9681563a88ff: Pulling fs layer 2025-12-04T08:54:21.6524150Z d0fb85b00833: Waiting 2025-12-04T08:54:21.6524330Z 73e33534e9eb: Pulling fs layer 2025-12-04T08:54:21.6524512Z 59b639308833: Waiting 
2025-12-04T08:54:21.6524689Z 5bfdaeb5578d: Pulling fs layer 2025-12-04T08:54:21.6524895Z dc112c89d57a: Waiting 2025-12-04T08:54:21.6525067Z c07d27e4d3a5: Pulling fs layer 2025-12-04T08:54:21.6525265Z b21856d1bf42: Pulling fs layer 2025-12-04T08:54:21.6525895Z cb19d84867e4: Pulling fs layer 2025-12-04T08:54:21.6526089Z 8165374f8dcc: Pulling fs layer 2025-12-04T08:54:21.6526281Z 1aecc77354ce: Pulling fs layer 2025-12-04T08:54:21.6542779Z 465d3fd643aa: Pulling fs layer 2025-12-04T08:54:21.6543081Z 6c503e779d6f: Pulling fs layer 2025-12-04T08:54:21.6543300Z f7e9a021f0ee: Pulling fs layer 2025-12-04T08:54:21.6543493Z 8e023b349080: Pulling fs layer 2025-12-04T08:54:21.6543700Z 522eab2402e5: Waiting 2025-12-04T08:54:21.6543880Z 8188df80e595: Pulling fs layer 2025-12-04T08:54:21.6546292Z 3c2c2f8c74bf: Pulling fs layer 2025-12-04T08:54:21.6546602Z 2aa7784fbe33: Pulling fs layer 2025-12-04T08:54:21.6546816Z 2b3b5215d3eb: Pulling fs layer 2025-12-04T08:54:21.6547007Z 99b1f1ea3e85: Pulling fs layer 2025-12-04T08:54:21.6547194Z 18d6daba0a57: Pulling fs layer 2025-12-04T08:54:21.6547670Z 5277f2a503eb: Pulling fs layer 2025-12-04T08:54:21.6547854Z 2b5a11b41761: Waiting 2025-12-04T08:54:21.6548025Z 3198a9717aac: Pulling fs layer 2025-12-04T08:54:21.6548211Z 9681563a88ff: Waiting 2025-12-04T08:54:21.6548389Z 99a4918e5808: Pulling fs layer 2025-12-04T08:54:21.6548570Z b21856d1bf42: Waiting 2025-12-04T08:54:21.6548729Z 73e33534e9eb: Waiting 2025-12-04T08:54:21.6549178Z 15bb11dfc6ac: Pulling fs layer 2025-12-04T08:54:21.6552283Z cb19d84867e4: Waiting 2025-12-04T08:54:21.6552669Z 5bfdaeb5578d: Waiting 2025-12-04T08:54:21.6552944Z 8165374f8dcc: Waiting 2025-12-04T08:54:21.6553221Z bd87c8766e90: Pulling fs layer 2025-12-04T08:54:21.6553531Z 1969e15d0c13: Pulling fs layer 2025-12-04T08:54:21.6553778Z 24a03847d382: Pulling fs layer 2025-12-04T08:54:21.6554005Z 1aecc77354ce: Waiting 2025-12-04T08:54:21.6554212Z 816e2e34e018: Pulling fs layer 2025-12-04T08:54:21.6554428Z c07d27e4d3a5: 
Waiting 2025-12-04T08:54:21.6554628Z b168858b8537: Pulling fs layer 2025-12-04T08:54:21.6554871Z 6b8d5ff02e26: Pulling fs layer 2025-12-04T08:54:21.6555093Z 4e3b10a5dd6a: Pulling fs layer 2025-12-04T08:54:21.6555311Z 3092fab73b59: Pulling fs layer 2025-12-04T08:54:21.6555527Z 20020dd28a15: Pulling fs layer 2025-12-04T08:54:21.6555749Z ae5280ce969d: Pulling fs layer 2025-12-04T08:54:21.6555971Z 4f4fb700ef54: Pulling fs layer 2025-12-04T08:54:21.6556192Z fe17d9eb0fd2: Pulling fs layer 2025-12-04T08:54:21.6556415Z a51e0dab2d59: Pulling fs layer 2025-12-04T08:54:21.6556637Z 6eb176cefd72: Pulling fs layer 2025-12-04T08:54:21.6556855Z e7b8cf2e8d5a: Pulling fs layer 2025-12-04T08:54:21.6557072Z 8e023b349080: Waiting 2025-12-04T08:54:21.6557265Z f7e9a021f0ee: Waiting 2025-12-04T08:54:21.6557455Z 8188df80e595: Waiting 2025-12-04T08:54:21.6557655Z ef3a5060abce: Pulling fs layer 2025-12-04T08:54:21.6557869Z 465d3fd643aa: Waiting 2025-12-04T08:54:21.6558116Z a6f4ec14b42b: Pulling fs layer 2025-12-04T08:54:21.6558328Z ae5280ce969d: Waiting 2025-12-04T08:54:21.6558521Z 3c2c2f8c74bf: Waiting 2025-12-04T08:54:21.6558728Z 7e5a0c956cfb: Pulling fs layer 2025-12-04T08:54:21.6558942Z fe17d9eb0fd2: Waiting 2025-12-04T08:54:21.6559133Z 6c503e779d6f: Waiting 2025-12-04T08:54:21.6559371Z b4f78730cfe7: Pulling fs layer 2025-12-04T08:54:21.6559592Z a51e0dab2d59: Waiting 2025-12-04T08:54:21.6559791Z 081028f24389: Pulling fs layer 2025-12-04T08:54:21.6560001Z b168858b8537: Waiting 2025-12-04T08:54:21.6560205Z a534dcf4b9a9: Pulling fs layer 2025-12-04T08:54:21.6560416Z 4f4fb700ef54: Waiting 2025-12-04T08:54:21.6560606Z 2aa7784fbe33: Waiting 2025-12-04T08:54:21.6560792Z 99a4918e5808: Waiting 2025-12-04T08:54:21.6560977Z 2b3b5215d3eb: Waiting 2025-12-04T08:54:21.6561167Z e7b8cf2e8d5a: Waiting 2025-12-04T08:54:21.6561355Z 4e3b10a5dd6a: Waiting 2025-12-04T08:54:21.6561544Z 5277f2a503eb: Waiting 2025-12-04T08:54:21.6561733Z 15bb11dfc6ac: Waiting 2025-12-04T08:54:21.6561962Z 6eb176cefd72: Waiting 
2025-12-04T08:54:21.6562149Z 24a03847d382: Waiting 2025-12-04T08:54:21.6562333Z 1969e15d0c13: Waiting 2025-12-04T08:54:21.6562519Z 3198a9717aac: Waiting 2025-12-04T08:54:21.6562713Z 99b1f1ea3e85: Waiting 2025-12-04T08:54:21.6562898Z 3092fab73b59: Waiting 2025-12-04T08:54:21.6563083Z bd87c8766e90: Waiting 2025-12-04T08:54:21.6563266Z 6b8d5ff02e26: Waiting 2025-12-04T08:54:21.6563700Z b4f78730cfe7: Waiting 2025-12-04T08:54:21.6563838Z a534dcf4b9a9: Waiting 2025-12-04T08:54:21.6563979Z 081028f24389: Waiting 2025-12-04T08:54:21.6564118Z 20020dd28a15: Waiting 2025-12-04T08:54:21.6564267Z 18d6daba0a57: Waiting 2025-12-04T08:54:21.6564421Z 2e77500302cc: Pulling fs layer 2025-12-04T08:54:21.6564593Z bc08246bb4ba: Pulling fs layer 2025-12-04T08:54:21.6564750Z 2e77500302cc: Waiting 2025-12-04T08:54:21.6564892Z 7e5a0c956cfb: Waiting 2025-12-04T08:54:21.6565027Z a6f4ec14b42b: Waiting 2025-12-04T08:54:21.6565164Z 816e2e34e018: Waiting 2025-12-04T08:54:21.6565307Z ef3a5060abce: Waiting 2025-12-04T08:54:21.6565451Z ff0c473ca120: Pulling fs layer 2025-12-04T08:54:21.6565611Z bc08246bb4ba: Waiting 2025-12-04T08:54:21.6565756Z 6bbc14b250ef: Pulling fs layer 2025-12-04T08:54:21.6565979Z ff0c473ca120: Waiting 2025-12-04T08:54:21.6566125Z 6bbc14b250ef: Waiting 2025-12-04T08:54:23.3072125Z 835841cca3b7: Verifying Checksum 2025-12-04T08:54:23.3072603Z 835841cca3b7: Download complete 2025-12-04T08:54:23.3623447Z 63e5bc7682b8: Download complete 2025-12-04T08:54:23.8765407Z 63e5bc7682b8: Pull complete 2025-12-04T08:54:23.8836923Z 835841cca3b7: Pull complete 2025-12-04T08:54:23.9054940Z 029495b23122: Verifying Checksum 2025-12-04T08:54:23.9055169Z 029495b23122: Download complete 2025-12-04T08:54:23.9634679Z d0fb85b00833: Verifying Checksum 2025-12-04T08:54:23.9634875Z d0fb85b00833: Download complete 2025-12-04T08:54:24.4829761Z 59b639308833: Verifying Checksum 2025-12-04T08:54:24.4830104Z 59b639308833: Download complete 2025-12-04T08:54:25.0677158Z 522eab2402e5: Verifying Checksum 
2025-12-04T08:54:25.0677524Z 522eab2402e5: Download complete 2025-12-04T08:54:25.6852161Z 2b5a11b41761: Verifying Checksum 2025-12-04T08:54:25.6852536Z 2b5a11b41761: Download complete 2025-12-04T08:54:26.2783609Z 9681563a88ff: Download complete 2025-12-04T08:54:27.5351811Z dc112c89d57a: Verifying Checksum 2025-12-04T08:54:27.5352323Z dc112c89d57a: Download complete 2025-12-04T08:54:28.1141233Z 5bfdaeb5578d: Download complete 2025-12-04T08:54:28.9194108Z c07d27e4d3a5: Download complete 2025-12-04T08:54:29.5250845Z b21856d1bf42: Verifying Checksum 2025-12-04T08:54:29.5251204Z b21856d1bf42: Download complete 2025-12-04T08:54:30.1287895Z cb19d84867e4: Download complete 2025-12-04T08:54:30.7088344Z 8165374f8dcc: Verifying Checksum 2025-12-04T08:54:30.7088689Z 8165374f8dcc: Download complete 2025-12-04T08:54:44.6018111Z 1aecc77354ce: Verifying Checksum 2025-12-04T08:54:44.6018459Z 1aecc77354ce: Download complete 2025-12-04T08:54:45.2518211Z 465d3fd643aa: Verifying Checksum 2025-12-04T08:54:45.2518556Z 465d3fd643aa: Download complete 2025-12-04T08:54:45.9070145Z 6c503e779d6f: Download complete 2025-12-04T08:54:46.5283769Z f7e9a021f0ee: Verifying Checksum 2025-12-04T08:54:46.5284091Z f7e9a021f0ee: Download complete 2025-12-04T08:54:47.2011348Z 8e023b349080: Verifying Checksum 2025-12-04T08:54:47.2011498Z 8e023b349080: Download complete 2025-12-04T08:55:23.9384148Z aac69780afc8: Download complete 2025-12-04T08:55:24.5112054Z 3c2c2f8c74bf: Verifying Checksum 2025-12-04T08:55:24.5112487Z 3c2c2f8c74bf: Download complete 2025-12-04T08:55:25.0884834Z 2aa7784fbe33: Verifying Checksum 2025-12-04T08:55:25.0885035Z 2aa7784fbe33: Download complete 2025-12-04T08:55:25.6695630Z 2b3b5215d3eb: Verifying Checksum 2025-12-04T08:55:25.6695973Z 2b3b5215d3eb: Download complete 2025-12-04T08:55:27.9937668Z aac69780afc8: Pull complete 2025-12-04T08:55:27.9975576Z 029495b23122: Pull complete 2025-12-04T08:55:28.0029209Z d0fb85b00833: Pull complete 2025-12-04T08:55:28.0069820Z 59b639308833: Pull 
complete 2025-12-04T08:55:29.0610574Z dc112c89d57a: Pull complete 2025-12-04T08:55:29.0660849Z 522eab2402e5: Pull complete 2025-12-04T08:55:29.0705459Z 2b5a11b41761: Pull complete 2025-12-04T08:55:29.0746182Z 9681563a88ff: Pull complete 2025-12-04T08:55:44.6484989Z 99b1f1ea3e85: Verifying Checksum 2025-12-04T08:55:44.6485214Z 99b1f1ea3e85: Download complete 2025-12-04T08:55:45.2279964Z 18d6daba0a57: Download complete 2025-12-04T08:55:45.7967275Z 5277f2a503eb: Download complete 2025-12-04T08:55:46.3639053Z 3198a9717aac: Verifying Checksum 2025-12-04T08:55:46.3639257Z 3198a9717aac: Download complete 2025-12-04T08:55:47.0609396Z 99a4918e5808: Verifying Checksum 2025-12-04T08:55:47.0610397Z 99a4918e5808: Download complete 2025-12-04T08:55:48.5138362Z 15bb11dfc6ac: Verifying Checksum 2025-12-04T08:55:48.5138600Z 15bb11dfc6ac: Download complete 2025-12-04T08:55:49.2314702Z bd87c8766e90: Verifying Checksum 2025-12-04T08:55:49.2314932Z bd87c8766e90: Download complete 2025-12-04T08:55:49.9253736Z 1969e15d0c13: Verifying Checksum 2025-12-04T08:55:49.9253980Z 1969e15d0c13: Download complete 2025-12-04T08:55:51.9250887Z 24a03847d382: Verifying Checksum 2025-12-04T08:55:51.9251125Z 24a03847d382: Download complete 2025-12-04T08:55:52.6442707Z 816e2e34e018: Download complete 2025-12-04T08:55:53.3491223Z b168858b8537: Verifying Checksum 2025-12-04T08:55:53.3491447Z b168858b8537: Download complete 2025-12-04T08:55:55.6195684Z 6b8d5ff02e26: Verifying Checksum 2025-12-04T08:55:55.6196203Z 6b8d5ff02e26: Download complete 2025-12-04T08:55:56.4015833Z 4e3b10a5dd6a: Verifying Checksum 2025-12-04T08:55:56.4016241Z 4e3b10a5dd6a: Download complete 2025-12-04T08:55:57.0750689Z 3092fab73b59: Verifying Checksum 2025-12-04T08:55:57.0751024Z 3092fab73b59: Download complete 2025-12-04T08:55:57.8040315Z 20020dd28a15: Download complete 2025-12-04T08:55:58.5174058Z ae5280ce969d: Download complete 2025-12-04T08:55:58.9068777Z 4f4fb700ef54: Verifying Checksum 2025-12-04T08:55:58.9069087Z 4f4fb700ef54: 
Download complete 2025-12-04T08:55:59.6206166Z fe17d9eb0fd2: Verifying Checksum 2025-12-04T08:55:59.6206433Z fe17d9eb0fd2: Download complete 2025-12-04T08:56:00.3503468Z a51e0dab2d59: Verifying Checksum 2025-12-04T08:56:00.3503694Z a51e0dab2d59: Download complete 2025-12-04T08:56:01.2336029Z 6eb176cefd72: Download complete 2025-12-04T08:56:01.9166655Z e7b8cf2e8d5a: Verifying Checksum 2025-12-04T08:56:01.9167098Z e7b8cf2e8d5a: Download complete 2025-12-04T08:56:02.6567940Z ef3a5060abce: Verifying Checksum 2025-12-04T08:56:02.6568148Z ef3a5060abce: Download complete 2025-12-04T08:56:03.3932926Z a6f4ec14b42b: Download complete 2025-12-04T08:56:04.0296699Z 7e5a0c956cfb: Download complete 2025-12-04T09:11:41.3506445Z 73e33534e9eb: Verifying Checksum 2025-12-04T09:11:41.3506839Z 73e33534e9eb: Download complete 2025-12-04T09:11:41.9419801Z 081028f24389: Verifying Checksum 2025-12-04T09:11:41.9420194Z 081028f24389: Download complete 2025-12-04T09:11:42.5289006Z a534dcf4b9a9: Verifying Checksum 2025-12-04T09:11:42.5289279Z a534dcf4b9a9: Download complete 2025-12-04T09:11:48.0056513Z 2e77500302cc: Verifying Checksum 2025-12-04T09:11:48.0056845Z 2e77500302cc: Download complete 2025-12-04T09:11:48.6525235Z bc08246bb4ba: Verifying Checksum 2025-12-04T09:11:48.6525448Z bc08246bb4ba: Download complete 2025-12-04T09:11:49.2382716Z ff0c473ca120: Verifying Checksum 2025-12-04T09:11:49.2382982Z ff0c473ca120: Download complete 2025-12-04T09:11:51.4735451Z 6bbc14b250ef: Verifying Checksum 2025-12-04T09:11:51.4735671Z 6bbc14b250ef: Download complete 2025-12-04T09:12:02.7244677Z 73e33534e9eb: Pull complete 2025-12-04T09:12:02.7282211Z 5bfdaeb5578d: Pull complete 2025-12-04T09:12:02.7430594Z c07d27e4d3a5: Pull complete 2025-12-04T09:12:02.7483016Z b21856d1bf42: Pull complete 2025-12-04T09:12:02.7522492Z cb19d84867e4: Pull complete 2025-12-04T09:12:02.7557599Z 8165374f8dcc: Pull complete 2025-12-04T09:12:06.2737006Z 1aecc77354ce: Pull complete 2025-12-04T09:12:06.2788934Z 465d3fd643aa: 
Pull complete 2025-12-04T09:12:06.2839960Z 6c503e779d6f: Pull complete 2025-12-04T09:12:06.2919544Z f7e9a021f0ee: Pull complete 2025-12-04T09:12:06.2955365Z 8e023b349080: Pull complete 2025-12-04T09:15:42.9990012Z 8188df80e595: Download complete 2025-12-04T09:16:26.3770673Z 8188df80e595: Pull complete 2025-12-04T09:16:26.3819248Z 3c2c2f8c74bf: Pull complete 2025-12-04T09:16:26.3865204Z 2aa7784fbe33: Pull complete 2025-12-04T09:16:26.3909039Z 2b3b5215d3eb: Pull complete 2025-12-04T09:16:31.2512219Z 99b1f1ea3e85: Pull complete 2025-12-04T09:16:31.2555202Z 18d6daba0a57: Pull complete 2025-12-04T09:16:31.2598743Z 5277f2a503eb: Pull complete 2025-12-04T09:16:31.2637702Z 3198a9717aac: Pull complete 2025-12-04T09:16:31.2679343Z 99a4918e5808: Pull complete 2025-12-04T09:16:31.2983745Z 15bb11dfc6ac: Pull complete 2025-12-04T09:16:31.3023085Z bd87c8766e90: Pull complete 2025-12-04T09:16:31.3056828Z 1969e15d0c13: Pull complete 2025-12-04T09:16:31.5183395Z 24a03847d382: Pull complete 2025-12-04T09:16:31.5215351Z 816e2e34e018: Pull complete 2025-12-04T09:16:31.5262680Z b168858b8537: Pull complete 2025-12-04T09:16:31.6260159Z 6b8d5ff02e26: Pull complete 2025-12-04T09:16:31.6292080Z 4e3b10a5dd6a: Pull complete 2025-12-04T09:16:31.6377897Z 3092fab73b59: Pull complete 2025-12-04T09:16:31.6407023Z 20020dd28a15: Pull complete 2025-12-04T09:16:31.6446573Z ae5280ce969d: Pull complete 2025-12-04T09:16:31.6486249Z 4f4fb700ef54: Pull complete 2025-12-04T09:16:31.6519765Z fe17d9eb0fd2: Pull complete 2025-12-04T09:16:31.6574090Z a51e0dab2d59: Pull complete 2025-12-04T09:16:31.6621174Z 6eb176cefd72: Pull complete 2025-12-04T09:16:31.6660432Z e7b8cf2e8d5a: Pull complete 2025-12-04T09:16:31.6700152Z ef3a5060abce: Pull complete 2025-12-04T09:16:31.6783918Z a6f4ec14b42b: Pull complete 2025-12-04T09:16:31.6821670Z 7e5a0c956cfb: Pull complete 2025-12-04T09:16:51.1384237Z b4f78730cfe7: Verifying Checksum 2025-12-04T09:16:51.1384661Z b4f78730cfe7: Download complete 2025-12-04T09:17:28.5183845Z 
b4f78730cfe7: Pull complete 2025-12-04T09:17:28.5229971Z 081028f24389: Pull complete 2025-12-04T09:17:28.5277727Z a534dcf4b9a9: Pull complete 2025-12-04T09:17:31.1440208Z 2e77500302cc: Pull complete 2025-12-04T09:17:31.1484857Z bc08246bb4ba: Pull complete 2025-12-04T09:17:31.1563132Z ff0c473ca120: Pull complete 2025-12-04T09:17:31.7711306Z 6bbc14b250ef: Pull complete 2025-12-04T09:17:31.7728674Z Digest: sha256:5e190224966743059cf8506170eaec525eada34e38cf646e02d1dbeadfe5a366 2025-12-04T09:17:31.7730927Z Status: Downloaded newer image for 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:17:31.7736588Z 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:17:31.7787409Z Prepare all required actions 2025-12-04T09:17:31.7802392Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T09:17:31.7802536Z with: 2025-12-04T09:17:31.7802795Z github-token: *** 2025-12-04T09:17:31.7802896Z env: 2025-12-04T09:17:31.7802991Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:17:31.7803131Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:17:31.7803314Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:17:31.7803483Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:17:31.7803887Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:17:31.7804261Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:17:31.7804379Z AWS_REGION: us-east-1 2025-12-04T09:17:31.7804500Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:17:31.7804664Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:17:31.7806888Z AWS_SESSION_TOKEN: *** 2025-12-04T09:17:31.7806993Z ##[endgroup] 
2025-12-04T09:17:31.7813384Z ##[group]Run set -eux 2025-12-04T09:17:31.7813499Z set -eux 2025-12-04T09:17:31.7813668Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T09:17:31.7817977Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:17:31.7818125Z env: 2025-12-04T09:17:31.7818217Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:17:31.7818349Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:17:31.7818534Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:17:31.7818699Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:17:31.7819081Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:17:31.7819448Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:17:31.7819565Z AWS_REGION: us-east-1 2025-12-04T09:17:31.7819701Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:17:31.7819868Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:17:31.7822102Z AWS_SESSION_TOKEN: *** 2025-12-04T09:17:31.7822253Z GITHUB_TOKEN: *** 2025-12-04T09:17:31.7822350Z ##[endgroup] 2025-12-04T09:17:31.7842564Z + python3 .github/scripts/get_workflow_job_id.py 19922849170 linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd 2025-12-04T09:17:32.8233170Z Setting output job-id=57116213137 2025-12-04T09:17:32.8234080Z Setting output job-name=linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:17:32.8362231Z Prepare all required actions 2025-12-04T09:17:32.8362439Z Getting action download info 2025-12-04T09:17:33.2357773Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T09:17:34.9174827Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 
2025-12-04T09:17:37.0874146Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T09:17:37.0874318Z with: 2025-12-04T09:17:37.0874429Z name: linux-jammy-rocm-py3.10 2025-12-04T09:17:37.0874557Z s3-bucket: gha-artifacts 2025-12-04T09:17:37.0874673Z env: 2025-12-04T09:17:37.0874771Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:17:37.0874910Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:17:37.0875093Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:17:37.0875270Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:17:37.0875682Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:17:37.0876058Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:17:37.0876180Z AWS_REGION: us-east-1 2025-12-04T09:17:37.0876405Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:17:37.0876675Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:17:37.0878896Z AWS_SESSION_TOKEN: *** 2025-12-04T09:17:37.0879003Z ##[endgroup] 2025-12-04T09:17:37.0893049Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T09:17:37.0893192Z with: 2025-12-04T09:17:37.0893296Z name: linux-jammy-rocm-py3.10 2025-12-04T09:17:37.0893418Z s3-bucket: gha-artifacts 2025-12-04T09:17:37.0893526Z region: us-east-1 2025-12-04T09:17:37.0893618Z env: 2025-12-04T09:17:37.0893717Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:17:37.0893850Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:17:37.0894025Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:17:37.0894190Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:17:37.0894567Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt 
seccomp=unconfined --network=host 2025-12-04T09:17:37.0894931Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:17:37.0895048Z AWS_REGION: us-east-1 2025-12-04T09:17:37.0895207Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:17:37.0895360Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:17:37.0897553Z AWS_SESSION_TOKEN: *** 2025-12-04T09:17:37.0897658Z ##[endgroup] 2025-12-04T09:17:37.3198733Z (node:17223) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T09:17:37.3199020Z 2025-12-04T09:17:37.3199137Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T09:17:37.3199458Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T09:17:37.3199753Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T09:17:37.6123494Z Found 1 objects with prefix pytorch/pytorch/19922849170/linux-jammy-rocm-py3.10/ 2025-12-04T09:17:37.6124196Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T09:18:18.4193721Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T09:18:18.4196955Z Artifact download has finished successfully 2025-12-04T09:18:18.4383050Z ##[group]Run unzip -o artifacts.zip 2025-12-04T09:18:18.4383222Z unzip -o artifacts.zip 2025-12-04T09:18:18.4387689Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:18.4387846Z env: 2025-12-04T09:18:18.4387947Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:18.4388273Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:18.4388632Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:18.4388814Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:18.4389228Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined 
--network=host 2025-12-04T09:18:18.4389600Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:18.4389714Z AWS_REGION: us-east-1 2025-12-04T09:18:18.4389900Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:18.4390050Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:18.4392470Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:18.4392579Z ##[endgroup] 2025-12-04T09:18:18.4426067Z Archive: artifacts.zip 2025-12-04T09:18:18.4427051Z creating: dist/ 2025-12-04T09:18:18.4510052Z inflating: dist/.ninja_log 2025-12-04T09:18:21.3742362Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T09:18:21.3744383Z creating: build/ 2025-12-04T09:18:21.3744871Z creating: build/custom_test_artifacts/ 2025-12-04T09:18:21.3745131Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T09:18:21.3745366Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T09:18:21.3748036Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:18:21.3748359Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:18:21.3748655Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T09:18:21.3749036Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:18:21.3749343Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:18:21.3749642Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:18:21.3749991Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:18:21.3750337Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:18:21.3750667Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:18:21.3750980Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:18:21.3751283Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:18:21.3751642Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:18:21.3752070Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:18:21.3752412Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:18:21.3752780Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:18:21.3753183Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:18:21.3753504Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:18:21.3753796Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:18:21.3754080Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T09:18:21.3754372Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T09:18:21.3754702Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T09:18:21.3755062Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T09:18:21.3756069Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T09:18:21.3756398Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T09:18:21.3756729Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T09:18:21.3757068Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T09:18:21.3757403Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T09:18:21.3757743Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T09:18:21.3758083Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T09:18:21.3766367Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T09:18:21.3873067Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T09:18:21.3873736Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T09:18:21.3874031Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T09:18:21.3874354Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T09:18:21.3874666Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T09:18:21.3874951Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T09:18:21.3875247Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T09:18:21.3875552Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T09:18:21.3875857Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T09:18:21.3876158Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T09:18:21.3876450Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 
2025-12-04T09:18:21.3887402Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T09:18:21.3930646Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T09:18:21.3930971Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:18:21.3931260Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:18:21.3931531Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T09:18:21.3931771Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T09:18:21.3932244Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T09:18:21.3932607Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-12-04T09:18:21.3932916Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-12-04T09:18:21.3933737Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T09:18:21.3933990Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T09:18:21.3934240Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T09:18:21.4025433Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T09:18:21.4055080Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T09:18:21.4055484Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T09:18:21.4055742Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T09:18:21.4055975Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:18:21.4057797Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:18:21.4058136Z 
creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-12-04T09:18:21.4058403Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:18:21.4058689Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:18:21.4058959Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:18:21.4059814Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:18:21.4060533Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:18:21.4060847Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:18:21.4061139Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:18:21.4061418Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:18:21.4062445Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:18:21.4063193Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:18:21.4063515Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:18:21.4064715Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:18:21.4065506Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:18:21.4065806Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:18:21.4066051Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:18:21.4066307Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T09:18:21.4066575Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T09:18:21.4066883Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T09:18:21.4067213Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T09:18:21.4067534Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T09:18:21.4067840Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T09:18:21.4068151Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T09:18:21.4068443Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T09:18:21.4068730Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T09:18:21.4069021Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T09:18:21.4069302Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T09:18:21.4079861Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T09:18:21.4113761Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T09:18:21.4114148Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:18:21.4114488Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:18:21.4114753Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T09:18:21.4114994Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T09:18:21.4115293Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T09:18:21.4115558Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-12-04T09:18:21.4115807Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-12-04T09:18:21.4116658Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T09:18:21.4116927Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T09:18:21.4117176Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T09:18:21.4137714Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T09:18:21.4137925Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T09:18:21.4138133Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T09:18:21.4138374Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T09:18:21.4140418Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T09:18:21.4140686Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T09:18:21.4140949Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T09:18:21.4141232Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T09:18:21.4141504Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T09:18:21.4142441Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T09:18:21.4143173Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T09:18:21.4143474Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T09:18:21.4143765Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T09:18:21.4144042Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T09:18:21.4145055Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T09:18:21.4145811Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T09:18:21.4146122Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T09:18:21.4147213Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T09:18:21.4147978Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T09:18:21.4148289Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T09:18:21.4148540Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T09:18:21.4148792Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T09:18:21.4149066Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T09:18:21.4149363Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T09:18:21.4149795Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T09:18:21.4150116Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T09:18:21.4150415Z inflating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T09:18:21.4150728Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T09:18:21.4151038Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T09:18:21.4151348Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T09:18:21.4151657Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T09:18:21.4152008Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T09:18:21.4152856Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T09:18:21.4216514Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T09:18:21.4216827Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T09:18:21.4217140Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T09:18:21.4217494Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T09:18:21.4217828Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T09:18:21.4218147Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T09:18:21.4218484Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T09:18:21.4218810Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 
2025-12-04T09:18:21.4219137Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T09:18:21.4219475Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T09:18:21.4219788Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T09:18:21.4230241Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-12-04T09:18:21.4259625Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T09:18:21.4259968Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T09:18:21.4260264Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T09:18:21.4260534Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T09:18:21.4260782Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T09:18:21.4261803Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T09:18:21.4262193Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-12-04T09:18:21.4262461Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-12-04T09:18:21.4262713Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T09:18:21.4263276Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T09:18:21.4263664Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T09:18:21.4316898Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T09:18:21.4337678Z 
inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T09:18:21.4337867Z creating: build/lib/ 2025-12-04T09:18:21.4383026Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T09:18:21.4626010Z inflating: build/lib/libprotobuf.a 2025-12-04T09:18:21.4898218Z inflating: build/lib/libprotoc.a 2025-12-04T09:18:21.4903519Z inflating: build/lib/libpthreadpool.a 2025-12-04T09:18:21.4907484Z inflating: build/lib/libcpuinfo.a 2025-12-04T09:18:21.4911469Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T09:18:21.4912121Z inflating: build/lib/libclog.a 2025-12-04T09:18:21.4922382Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T09:18:21.4923332Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T09:18:21.4933098Z inflating: build/lib/libnnpack.a 2025-12-04T09:18:21.5033818Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T09:18:21.5500194Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T09:18:21.5537856Z inflating: build/lib/libgtest.a 2025-12-04T09:18:21.5547024Z inflating: build/lib/libgmock.a 2025-12-04T09:18:21.5547228Z inflating: build/lib/libgtest_main.a 2025-12-04T09:18:21.5547416Z inflating: build/lib/libgmock_main.a 2025-12-04T09:18:21.5596745Z inflating: build/lib/libXNNPACK.a 2025-12-04T09:18:21.5638068Z inflating: build/lib/libbenchmark.a 2025-12-04T09:18:21.5638282Z inflating: build/lib/libbenchmark_main.a 2025-12-04T09:18:21.5638482Z inflating: build/lib/libjitprofiling.a 2025-12-04T09:18:21.5642861Z inflating: build/lib/libittnotify.a 2025-12-04T09:18:21.5679678Z inflating: build/lib/libasmjit.a 2025-12-04T09:18:21.6299672Z inflating: build/lib/libfbgemm.a 2025-12-04T09:18:21.6316292Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T09:18:21.6610975Z inflating: build/lib/libtensorpipe.a 2025-12-04T09:18:21.6677074Z inflating: build/lib/libgloo.a 2025-12-04T09:18:21.6702400Z inflating: build/lib/libonnx_proto.a 2025-12-04T09:18:21.6924331Z inflating: build/lib/libgloo_hip.a 
2025-12-04T09:18:21.7315009Z inflating: build/lib/libonnx.a 2025-12-04T09:18:22.2824821Z inflating: build/lib/libdnnl.a 2025-12-04T09:18:22.2835383Z inflating: build/lib/libfmt.a 2025-12-04T09:18:22.3003786Z inflating: build/lib/libkineto.a 2025-12-04T09:18:22.3070574Z inflating: build/lib/libc10.so 2025-12-04T09:18:22.3071053Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T09:18:22.3071457Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T09:18:22.3096769Z inflating: build/lib/libc10_hip.so 2025-12-04T09:18:22.3369874Z inflating: build/lib/libfbgemm_genai.a 2025-12-04T09:18:24.0249612Z inflating: build/lib/libtorch_cpu.so 2025-12-04T09:18:24.0274340Z inflating: build/lib/libshm.so 2025-12-04T09:18:24.8508903Z inflating: build/lib/libtorch_hip.so 2025-12-04T09:18:24.8509545Z inflating: build/lib/libtorch.so 2025-12-04T09:18:24.8520545Z inflating: build/lib/libjitbackend_test.so 2025-12-04T09:18:24.8533948Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T09:18:24.8572955Z inflating: build/lib/libtorchbind_test.so 2025-12-04T09:18:24.8587316Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T09:18:24.9872599Z inflating: build/lib/libtorch_python.so 2025-12-04T09:18:24.9892173Z inflating: build/lib/libnnapi_backend.so 2025-12-04T09:18:24.9892510Z creating: build/bin/ 2025-12-04T09:18:24.9892767Z creating: build/bin/CMakeFiles/ 2025-12-04T09:18:24.9893058Z inflating: build/bin/cmake_install.cmake 2025-12-04T09:18:24.9893368Z inflating: build/bin/CTestTestfile.cmake 2025-12-04T09:18:25.0144155Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T09:18:25.0394252Z inflating: build/bin/protoc 2025-12-04T09:18:25.0426983Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T09:18:25.0457327Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T09:18:25.0488683Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T09:18:25.0520179Z inflating: build/bin/c10_Device_test 2025-12-04T09:18:25.0556130Z inflating: 
build/bin/c10_DispatchKeySet_test 2025-12-04T09:18:25.0588627Z inflating: build/bin/c10_Scalar_test 2025-12-04T09:18:25.0618323Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T09:18:25.0652738Z inflating: build/bin/c10_SymInt_test 2025-12-04T09:18:25.0686654Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T09:18:25.0718748Z inflating: build/bin/c10_Bitset_test 2025-12-04T09:18:25.0760308Z inflating: build/bin/c10_cow_test 2025-12-04T09:18:25.0793279Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T09:18:25.0826988Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T09:18:25.0858046Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T09:18:25.0887924Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T09:18:25.0918113Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T09:18:25.0950131Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T09:18:25.0981125Z inflating: build/bin/c10_Half_test 2025-12-04T09:18:25.1015526Z inflating: build/bin/c10_Enumerate_test 2025-12-04T09:18:25.1049367Z inflating: build/bin/c10_LeftRight_test 2025-12-04T09:18:25.1081711Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T09:18:25.1111953Z inflating: build/bin/c10_Semaphore_test 2025-12-04T09:18:25.1142504Z inflating: build/bin/c10_Synchronized_test 2025-12-04T09:18:25.1173998Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T09:18:25.1207449Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T09:18:25.1238975Z inflating: build/bin/c10_accumulate_test 2025-12-04T09:18:25.1273056Z inflating: build/bin/c10_bfloat16_test 2025-12-04T09:18:25.1303089Z inflating: build/bin/c10_error_test 2025-12-04T09:18:25.1333796Z inflating: build/bin/c10_bit_cast_test 2025-12-04T09:18:25.1367199Z inflating: build/bin/c10_complex_test 2025-12-04T09:18:25.1398923Z inflating: build/bin/c10_exception_test 2025-12-04T09:18:25.1433166Z inflating: build/bin/c10_complex_math_test 2025-12-04T09:18:25.1463962Z inflating: build/bin/c10_flags_test 
2025-12-04T09:18:25.1494939Z inflating: build/bin/c10_irange_test 2025-12-04T09:18:25.1525619Z inflating: build/bin/c10_generic_math_test 2025-12-04T09:18:25.1614512Z inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T09:18:25.1649156Z inflating: build/bin/c10_logging_test 2025-12-04T09:18:25.1679488Z inflating: build/bin/c10_nofatal_test 2025-12-04T09:18:25.1711981Z inflating: build/bin/c10_lazy_test 2025-12-04T09:18:25.1749137Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T09:18:25.1781425Z inflating: build/bin/c10_registry_test 2025-12-04T09:18:25.1812942Z inflating: build/bin/c10_ssize_test 2025-12-04T09:18:25.1857395Z inflating: build/bin/c10_optional_test 2025-12-04T09:18:25.1944262Z inflating: build/bin/c10_small_vector_test 2025-12-04T09:18:25.1978376Z inflating: build/bin/c10_string_util_test 2025-12-04T09:18:25.2008812Z inflating: build/bin/c10_tempfile_test 2025-12-04T09:18:25.2038735Z inflating: build/bin/c10_string_view_test 2025-12-04T09:18:25.2065525Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T09:18:25.2099260Z inflating: build/bin/c10_typeid_test 2025-12-04T09:18:25.2129158Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-12-04T09:18:25.2159072Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-12-04T09:18:25.2188917Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-12-04T09:18:25.2218708Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-12-04T09:18:25.2248670Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T09:18:25.2278345Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T09:18:25.2308063Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-12-04T09:18:25.2338072Z inflating: build/bin/c10_hip_HIPTest 2025-12-04T09:18:25.2664545Z inflating: build/bin/vec_test_all_types_DEFAULT 
2025-12-04T09:18:25.3000259Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T09:18:25.3342013Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T09:18:25.3399209Z inflating: build/bin/test_aoti_abi_check 2025-12-04T09:18:25.3429581Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T09:18:25.3462125Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T09:18:25.3492573Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T09:18:25.3524594Z inflating: build/bin/BackoffTest 2025-12-04T09:18:25.3556789Z inflating: build/bin/FileStoreTest 2025-12-04T09:18:25.3591069Z inflating: build/bin/TCPStoreTest 2025-12-04T09:18:25.3623736Z inflating: build/bin/HashStoreTest 2025-12-04T09:18:25.3664198Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T09:18:25.3665719Z inflating: build/bin/example_allreduce 2025-12-04T09:18:25.3667738Z inflating: build/bin/torch_shm_manager 2025-12-04T09:18:25.3700608Z inflating: build/bin/static_runtime_bench 2025-12-04T09:18:25.3843288Z inflating: build/bin/static_runtime_test 2025-12-04T09:18:25.3887182Z inflating: build/bin/Dict_test 2025-12-04T09:18:25.3919313Z inflating: build/bin/Dimname_test 2025-12-04T09:18:25.3958300Z inflating: build/bin/MaybeOwned_test 2025-12-04T09:18:25.3992858Z inflating: build/bin/NamedTensor_test 2025-12-04T09:18:25.4028632Z inflating: build/bin/apply_utils_test 2025-12-04T09:18:25.4064427Z inflating: build/bin/atest 2025-12-04T09:18:25.4102931Z inflating: build/bin/basic 2025-12-04T09:18:25.4135853Z inflating: build/bin/broadcast_test 2025-12-04T09:18:25.4167037Z inflating: build/bin/cpu_allocator_test 2025-12-04T09:18:25.4202162Z inflating: build/bin/cpu_generator_test 2025-12-04T09:18:25.4234264Z inflating: build/bin/cpu_profiling_allocator_test 2025-12-04T09:18:25.4289147Z inflating: build/bin/cpu_rng_test 2025-12-04T09:18:25.4320880Z inflating: build/bin/dlconvertor_test 2025-12-04T09:18:25.4355718Z inflating: build/bin/extension_backend_test 2025-12-04T09:18:25.4389279Z inflating: 
build/bin/half_test 2025-12-04T09:18:25.4446622Z inflating: build/bin/ivalue_test 2025-12-04T09:18:25.4476998Z inflating: build/bin/lazy_tensor_test 2025-12-04T09:18:25.4509080Z inflating: build/bin/math_kernel_test 2025-12-04T09:18:25.4541238Z inflating: build/bin/memory_format_test 2025-12-04T09:18:25.4573811Z inflating: build/bin/memory_overlapping_test 2025-12-04T09:18:25.4606180Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T09:18:25.4639969Z inflating: build/bin/native_test 2025-12-04T09:18:25.4671389Z inflating: build/bin/operator_name_test 2025-12-04T09:18:25.4702370Z inflating: build/bin/operators_test 2025-12-04T09:18:25.4733930Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T09:18:25.4774347Z inflating: build/bin/pow_test 2025-12-04T09:18:25.4808612Z inflating: build/bin/quantized_test 2025-12-04T09:18:25.4839396Z inflating: build/bin/reduce_ops_test 2025-12-04T09:18:25.4870418Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T09:18:25.4904139Z inflating: build/bin/scalar_tensor_test 2025-12-04T09:18:25.4938713Z inflating: build/bin/scalar_test 2025-12-04T09:18:25.4970066Z inflating: build/bin/StorageUtils_test 2025-12-04T09:18:25.5001603Z inflating: build/bin/stride_properties_test 2025-12-04T09:18:25.5048568Z inflating: build/bin/tensor_iterator_test 2025-12-04T09:18:25.5081484Z inflating: build/bin/test_parallel 2025-12-04T09:18:25.5112610Z inflating: build/bin/thread_init_test 2025-12-04T09:18:25.5145842Z inflating: build/bin/type_ptr_test 2025-12-04T09:18:25.5181458Z inflating: build/bin/type_test 2025-12-04T09:18:25.5213197Z inflating: build/bin/undefined_tensor_test 2025-12-04T09:18:25.5243351Z inflating: build/bin/verify_api_visibility 2025-12-04T09:18:25.5285651Z inflating: build/bin/legacy_vmap_test 2025-12-04T09:18:25.5316879Z inflating: build/bin/weakref_test 2025-12-04T09:18:25.5348149Z inflating: build/bin/wrapdim_test 2025-12-04T09:18:25.5408983Z inflating: build/bin/List_test 2025-12-04T09:18:25.5440119Z 
inflating: build/bin/xla_tensor_test 2025-12-04T09:18:25.5475738Z inflating: build/bin/IListRef_test 2025-12-04T09:18:25.5545065Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T09:18:25.5584756Z inflating: build/bin/KernelFunction_test 2025-12-04T09:18:25.5640787Z inflating: build/bin/kernel_function_test 2025-12-04T09:18:25.5714380Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T09:18:25.5774119Z inflating: build/bin/kernel_lambda_test 2025-12-04T09:18:25.5810192Z inflating: build/bin/kernel_stackbased_test 2025-12-04T09:18:25.5866154Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T09:18:25.5897551Z inflating: build/bin/CppSignature_test 2025-12-04T09:18:25.5927419Z inflating: build/bin/op_allowlist_test 2025-12-04T09:18:25.6103366Z inflating: build/bin/op_registration_test 2025-12-04T09:18:25.6133180Z inflating: build/bin/hip_complex_math_test 2025-12-04T09:18:25.6166513Z inflating: build/bin/backend_fallback_test 2025-12-04T09:18:25.6196420Z inflating: build/bin/hip_complex_test 2025-12-04T09:18:25.6236430Z inflating: build/bin/inline_container_test 2025-12-04T09:18:25.6268477Z inflating: build/bin/hip_apply_test 2025-12-04T09:18:25.6298469Z inflating: build/bin/hip_distributions_test 2025-12-04T09:18:25.6328295Z inflating: build/bin/hip_generator_test 2025-12-04T09:18:25.6358151Z inflating: build/bin/hip_half_test 2025-12-04T09:18:25.6387945Z inflating: build/bin/hip_integer_divider_test 2025-12-04T09:18:25.6418574Z inflating: build/bin/hip_optional_test 2025-12-04T09:18:25.6448416Z inflating: build/bin/hip_packedtensoraccessor_test 2025-12-04T09:18:25.6478296Z inflating: build/bin/hip_vectorized_test 2025-12-04T09:18:25.6510051Z inflating: build/bin/hip_dlconvertor_test 2025-12-04T09:18:25.7125369Z inflating: build/bin/test_jit 2025-12-04T09:18:25.7322140Z inflating: build/bin/test_lazy 2025-12-04T09:18:25.7355699Z inflating: build/bin/test_dist_autograd 2025-12-04T09:18:25.7396622Z inflating: 
build/bin/test_cpp_rpc 2025-12-04T09:18:25.7397999Z inflating: build/bin/parallel_benchmark 2025-12-04T09:18:25.8050845Z inflating: build/bin/test_api 2025-12-04T09:18:25.8051100Z creating: .additional_ci_files/ 2025-12-04T09:18:25.8086748Z inflating: .additional_ci_files/test-times.json 2025-12-04T09:18:25.8217279Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T09:18:25.8247790Z ##[group]Run rm artifacts.zip 2025-12-04T09:18:25.8248019Z rm artifacts.zip 2025-12-04T09:18:25.8252933Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:25.8253125Z env: 2025-12-04T09:18:25.8253244Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:25.8253413Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:25.8253629Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:25.8253837Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:25.8254293Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:25.8254733Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:25.8255076Z AWS_REGION: us-east-1 2025-12-04T09:18:25.8255304Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:25.8255639Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:25.8258242Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:25.8258354Z ##[endgroup] 2025-12-04T09:18:25.9175115Z ##[group]Run df -H 2025-12-04T09:18:25.9175235Z df -H 2025-12-04T09:18:25.9177913Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:25.9178082Z env: 2025-12-04T09:18:25.9178192Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:25.9178348Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:25.9178544Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:25.9178717Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 
2025-12-04T09:18:25.9179104Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:25.9179483Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:25.9179607Z AWS_REGION: us-east-1 2025-12-04T09:18:25.9179759Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:25.9179931Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:25.9182172Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:25.9182284Z ##[endgroup] 2025-12-04T09:18:25.9553852Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T09:18:25.9554278Z overlay 16T 619G 15T 5% / 2025-12-04T09:18:25.9554635Z tmpfs 68M 0 68M 0% /dev 2025-12-04T09:18:25.9554976Z /dev/md0 16T 619G 15T 5% /run 2025-12-04T09:18:25.9555328Z shm 68M 4.1k 68M 1% /dev/shm 2025-12-04T09:18:25.9555764Z amdprj2-k8s_2 5.5T 120G 5.4T 3% /home/runner/pytorch-data 2025-12-04T09:18:25.9556304Z tmpfs 3.3T 13k 3.3T 1% /run/secrets/kubernetes.io/serviceaccount 2025-12-04T09:18:25.9556748Z tmpfs 1.7T 0 1.7T 0% /proc/acpi 2025-12-04T09:18:25.9557130Z tmpfs 1.7T 0 1.7T 0% /proc/scsi 2025-12-04T09:18:25.9557492Z tmpfs 1.7T 0 1.7T 0% /sys/firmware 2025-12-04T09:18:25.9557919Z tmpfs 1.7T 0 1.7T 0% /sys/devices/virtual/powercap 2025-12-04T09:18:25.9583934Z Prepare all required actions 2025-12-04T09:18:25.9584158Z Getting action download info 2025-12-04T09:18:26.1870407Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T09:18:26.1870555Z with: 2025-12-04T09:18:26.1870652Z env: 2025-12-04T09:18:26.1870753Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:26.1870897Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:26.1871079Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:26.1871250Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:26.1871636Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device 
/dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:26.1872060Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:26.1872181Z AWS_REGION: us-east-1 2025-12-04T09:18:26.1872352Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:26.1872535Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:26.1874731Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:26.1874843Z ##[endgroup] 2025-12-04T09:18:26.1887739Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T09:18:26.1887872Z with: 2025-12-04T09:18:26.1887961Z name: td_results 2025-12-04T09:18:26.1888063Z s3-bucket: gha-artifacts 2025-12-04T09:18:26.1888171Z region: us-east-1 2025-12-04T09:18:26.1888265Z env: 2025-12-04T09:18:26.1888357Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:26.1888487Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:26.1888659Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:26.1888822Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:26.1889197Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:26.1889667Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:26.1889779Z AWS_REGION: us-east-1 2025-12-04T09:18:26.1889906Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:26.1890051Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:26.1892272Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:26.1892374Z ##[endgroup] 2025-12-04T09:18:26.4103061Z (node:17257) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T09:18:26.4103341Z 2025-12-04T09:18:26.4103471Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2025-12-04T09:18:26.4103787Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T09:18:26.4104104Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T09:18:26.6842602Z Found 1 objects with prefix pytorch/pytorch/19922849170/td_results/ 2025-12-04T09:18:26.6843099Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T09:18:27.1389730Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T09:18:27.1401058Z Artifact download has finished successfully 2025-12-04T09:18:27.1554337Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T09:18:27.1554561Z mkdir -p .additional_ci_files 2025-12-04T09:18:27.1554786Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T09:18:27.1559882Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:27.1560046Z env: 2025-12-04T09:18:27.1560149Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:27.1560295Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:27.1560486Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:27.1560668Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:27.1561292Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:27.1561692Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:27.1561815Z AWS_REGION: us-east-1 2025-12-04T09:18:27.1562048Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:27.1562212Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:27.1564714Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:27.1564826Z ##[endgroup] 2025-12-04T09:18:27.1632547Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T09:18:27.1632701Z .github/scripts/parse_ref.py 2025-12-04T09:18:27.1637824Z shell: /usr/bin/bash -e {0} 
2025-12-04T09:18:27.1637934Z env: 2025-12-04T09:18:27.1638027Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:27.1638166Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:27.1638345Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:27.1638520Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:27.1638898Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:27.1639266Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:27.1639382Z AWS_REGION: us-east-1 2025-12-04T09:18:27.1639516Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:27.1639678Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:27.1642144Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:27.1642247Z ##[endgroup] 2025-12-04T09:18:27.1752040Z Setting output branch=main 2025-12-04T09:18:27.1822800Z Prepare all required actions 2025-12-04T09:18:27.1823022Z Getting action download info 2025-12-04T09:18:27.3958907Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T09:18:27.3959059Z with: 2025-12-04T09:18:27.3959404Z github-token: *** 2025-12-04T09:18:27.3962587Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": 
"linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, 
{"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T09:18:27.3965738Z job-name: linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:27.3965949Z env: 2025-12-04T09:18:27.3966042Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:27.3966178Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:27.3966355Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:27.3966516Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:27.3966896Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:27.3967259Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:27.3967373Z AWS_REGION: us-east-1 2025-12-04T09:18:27.3967638Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:27.3967786Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:27.3970015Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:27.3970118Z ##[endgroup] 2025-12-04T09:18:27.3985837Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T09:18:27.3986013Z with: 2025-12-04T09:18:27.3986102Z shell: bash 2025-12-04T09:18:27.3986197Z timeout_minutes: 10 2025-12-04T09:18:27.3986299Z max_attempts: 5 2025-12-04T09:18:27.3986399Z retry_wait_seconds: 30 2025-12-04T09:18:27.3986693Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T09:18:27.3986993Z polling_interval_seconds: 1 2025-12-04T09:18:27.3987106Z warning_on_retry: true 2025-12-04T09:18:27.3987212Z continue_on_error: false 2025-12-04T09:18:27.3987316Z env: 2025-12-04T09:18:27.3987405Z GIT_DEFAULT_BRANCH: main 
2025-12-04T09:18:27.3987539Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:27.3987717Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:27.3987883Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:27.3988260Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:27.3988631Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:27.3988746Z AWS_REGION: us-east-1 2025-12-04T09:18:27.3988878Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:27.3989025Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:27.3991366Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:27.3991533Z GITHUB_TOKEN: *** 2025-12-04T09:18:27.3991631Z ##[endgroup] 2025-12-04T09:18:27.4375583Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T09:18:27.5800662Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T09:18:27.6752337Z Collecting requests==2.27.1 2025-12-04T09:18:27.7109149Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T09:18:27.7210536Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 6.3 MB/s eta 0:00:00 2025-12-04T09:18:27.7669190Z Collecting pyyaml==6.0.2 2025-12-04T09:18:27.7727535Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-12-04T09:18:27.7942380Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 36.8 MB/s eta 0:00:00 2025-12-04T09:18:27.8150649Z Collecting idna<4,>=2.5 2025-12-04T09:18:27.8214441Z Downloading idna-3.11-py3-none-any.whl (71 kB) 2025-12-04T09:18:27.8242455Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.0/71.0 KB 54.4 MB/s eta 0:00:00 2025-12-04T09:18:27.8451538Z Collecting certifi>=2017.4.17 2025-12-04T09:18:27.8511434Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T09:18:27.8532602Z 
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.4/159.4 KB 172.2 MB/s eta 0:00:00 2025-12-04T09:18:27.8833434Z Collecting urllib3<1.27,>=1.21.1 2025-12-04T09:18:27.8888902Z Downloading urllib3-1.26.20-py2.py3-none-any.whl (144 kB) 2025-12-04T09:18:27.8910412Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.2/144.2 KB 166.3 MB/s eta 0:00:00 2025-12-04T09:18:27.9804385Z Collecting charset-normalizer~=2.0.0 2025-12-04T09:18:27.9859110Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T09:18:28.0430417Z Installing collected packages: urllib3, pyyaml, idna, charset-normalizer, certifi, requests 2025-12-04T09:18:28.1388075Z WARNING: The script normalizer is installed in '/home/runner/.local/bin' which is not on PATH. 2025-12-04T09:18:28.1388737Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T09:18:28.1557234Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 idna-3.11 pyyaml-6.0.2 requests-2.27.1 urllib3-1.26.20 2025-12-04T09:18:28.4373905Z Command completed after 1 attempt(s). 
2025-12-04T09:18:28.4429107Z ##[group]Run set -x 2025-12-04T09:18:28.4429282Z set -x 2025-12-04T09:18:28.4429401Z  2025-12-04T09:18:28.4429615Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T09:18:28.4430022Z # in runner workspace 2025-12-04T09:18:28.4430232Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T09:18:28.4435510Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:28.4435707Z env: 2025-12-04T09:18:28.4435842Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:28.4436022Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:28.4436265Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:28.4436486Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:28.4437020Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:28.4437542Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:28.4437670Z AWS_REGION: us-east-1 2025-12-04T09:18:28.4437861Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:28.4438032Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:28.4440367Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:28.4440482Z ##[endgroup] 2025-12-04T09:18:28.4464144Z + python3 /home/runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T09:18:28.4554117Z Setting output branch=main 2025-12-04T09:18:28.4594217Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T09:18:28.4594472Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T09:18:28.4594669Z echo "Job name: ${JOB_NAME}" 2025-12-04T09:18:28.4594837Z  2025-12-04T09:18:28.4595048Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T09:18:28.4595319Z # in runner workspace 2025-12-04T09:18:28.4595560Z python3 
"${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T09:18:28.4595824Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T09:18:28.4596026Z  --job-name "${JOB_NAME}" \ 2025-12-04T09:18:28.4600139Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, 
"num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" \ 2025-12-04T09:18:28.4603824Z  --selected-test-configs "" \ 2025-12-04T09:18:28.4603982Z  --pr-number "${PR_NUMBER}" \ 2025-12-04T09:18:28.4604123Z  --tag "${TAG}" \ 2025-12-04T09:18:28.4604257Z  --event-name "${EVENT_NAME}" \ 2025-12-04T09:18:28.4604398Z  --schedule "${SCHEDULE}" \ 2025-12-04T09:18:28.4604539Z  --branch "${HEAD_BRANCH}" 2025-12-04T09:18:28.4608993Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:28.4609145Z env: 2025-12-04T09:18:28.4609245Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:28.4609386Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:28.4609565Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:28.4609737Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:28.4610122Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add 
video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:28.4610492Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:28.4610610Z AWS_REGION: us-east-1 2025-12-04T09:18:28.4610777Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:28.4610939Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:28.4613181Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:28.4613389Z GITHUB_TOKEN: *** 2025-12-04T09:18:28.4613587Z JOB_NAME: linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:28.4613795Z PR_NUMBER: 2025-12-04T09:18:28.4613892Z TAG: 2025-12-04T09:18:28.4613985Z EVENT_NAME: schedule 2025-12-04T09:18:28.4614090Z SCHEDULE: 29 8 * * * 2025-12-04T09:18:28.4614196Z HEAD_BRANCH: main 2025-12-04T09:18:28.4614298Z ##[endgroup] 2025-12-04T09:18:28.4635578Z Workflow: trunk-rocm-mi300 2025-12-04T09:18:28.4635918Z Job name: linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:29.0277708Z INFO:root:Issue https://github.com/pytorch/pytorch/issues/167616 created by jithunnair-amd has unstable all the test jobs for trunk-rocm-mi300 / linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:29.4631394Z Setting output keep-going=True 2025-12-04T09:18:29.4631729Z Setting output ci-verbose-test-logs=False 2025-12-04T09:18:29.4632148Z Setting output ci-test-showlocals=False 2025-12-04T09:18:29.4632384Z Setting output ci-no-test-timeout=False 2025-12-04T09:18:29.4632640Z Setting output ci-no-td=False 2025-12-04T09:18:29.4632858Z Setting output ci-td-distributed=False 2025-12-04T09:18:29.4633078Z Setting output is-unstable=True 2025-12-04T09:18:29.4633291Z Setting output reenabled-issues= 2025-12-04T09:18:29.4644874Z Setting output test-matrix={"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", 
"mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": 
"rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", 
"mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": 
"rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T09:18:29.4653903Z Setting output is-test-matrix-empty=False 2025-12-04T09:18:29.4751096Z ##[group]Run echo "Filtered matrix:" 2025-12-04T09:18:29.4751269Z echo "Filtered matrix:" 2025-12-04T09:18:29.4758206Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", 
"rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": 
"unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": 
"rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" 2025-12-04T09:18:29.4765254Z  2025-12-04T09:18:29.4765349Z echo 2025-12-04T09:18:29.4765468Z echo "Is the current job unstable? True" 2025-12-04T09:18:29.4765602Z  2025-12-04T09:18:29.4765693Z echo 2025-12-04T09:18:29.4765805Z echo "Is keep-going label set? True" 2025-12-04T09:18:29.4765934Z  2025-12-04T09:18:29.4766022Z echo 2025-12-04T09:18:29.4766124Z echo "Reenabled issues? 
" 2025-12-04T09:18:29.4770309Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:29.4770463Z env: 2025-12-04T09:18:29.4770567Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:29.4770707Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:29.4770887Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:29.4771058Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:29.4771446Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:29.4771817Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:29.4771972Z AWS_REGION: us-east-1 2025-12-04T09:18:29.4772156Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:29.4772311Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:29.4774499Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:29.4774613Z ##[endgroup] 2025-12-04T09:18:29.4797281Z Filtered matrix: 2025-12-04T09:18:29.4814508Z {include: [{config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: 
rerun_disabled_tests}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: 
rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, 
shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}]} 2025-12-04T09:18:29.4824474Z 2025-12-04T09:18:29.4824584Z Is the current job unstable? True 2025-12-04T09:18:29.4824699Z 2025-12-04T09:18:29.4824766Z Is keep-going label set? True 2025-12-04T09:18:29.4824873Z 2025-12-04T09:18:29.4824925Z Reenabled issues? 
2025-12-04T09:18:29.4852311Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:29.4852570Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T09:18:29.4856996Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:29.4857140Z env: 2025-12-04T09:18:29.4857235Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:29.4857371Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:29.4857546Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:29.4857710Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:29.4858091Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:29.4858462Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:29.4858579Z AWS_REGION: us-east-1 2025-12-04T09:18:29.4858769Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:29.4859019Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:29.4861226Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:29.4861329Z JOB_TIMEOUT: 600 2025-12-04T09:18:29.4861428Z ##[endgroup] 2025-12-04T09:18:29.4906900Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:29.4907189Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:29.4907428Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T09:18:29.4911988Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T09:18:29.4912151Z env: 2025-12-04T09:18:29.4912262Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:29.4912410Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:29.4912602Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:29.4912784Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:29.4913192Z GPU_FLAG: --device=/dev/mem 
--device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:29.4913630Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:29.4913765Z AWS_REGION: us-east-1 2025-12-04T09:18:29.4913951Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:29.4914116Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:29.4916703Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:29.4916827Z ##[endgroup] 2025-12-04T09:18:29.4986872Z ##[group]Run set -x 2025-12-04T09:18:29.4987008Z set -x 2025-12-04T09:18:29.4987101Z  2025-12-04T09:18:29.4987209Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T09:18:29.4987367Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T09:18:29.4987522Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T09:18:29.4987668Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-12-04T09:18:29.4987786Z else 2025-12-04T09:18:29.4987891Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T09:18:29.4988020Z fi 2025-12-04T09:18:29.4988106Z  2025-12-04T09:18:29.4988239Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T09:18:29.4988441Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T09:18:29.4988618Z # Used for GPU_FLAG since that doesn't play nice 2025-12-04T09:18:29.4988780Z # shellcheck disable=SC2086,SC2090 2025-12-04T09:18:29.4988915Z container_name=$(docker run \ 2025-12-04T09:18:29.4989043Z  ${GPU_FLAG:-} \ 2025-12-04T09:18:29.4989160Z  -e BUILD_ENVIRONMENT \ 2025-12-04T09:18:29.4989280Z  -e PR_NUMBER \ 2025-12-04T09:18:29.4989393Z  -e GITHUB_ACTIONS \ 2025-12-04T09:18:29.4989509Z  -e GITHUB_REPOSITORY \ 2025-12-04T09:18:29.4989628Z  -e GITHUB_WORKFLOW \ 2025-12-04T09:18:29.4989741Z  -e GITHUB_JOB \ 2025-12-04T09:18:29.4989848Z  -e GITHUB_RUN_ID \ 2025-12-04T09:18:29.4990070Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T09:18:29.4990187Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T09:18:29.4990305Z  -e JOB_ID \ 
2025-12-04T09:18:29.4990409Z  -e JOB_NAME \ 2025-12-04T09:18:29.4990516Z  -e BASE_SHA \ 2025-12-04T09:18:29.4990617Z  -e BRANCH \ 2025-12-04T09:18:29.4990716Z  -e SHA1 \ 2025-12-04T09:18:29.4990822Z  -e AWS_DEFAULT_REGION \ 2025-12-04T09:18:29.4990941Z  -e IN_WHEEL_TEST \ 2025-12-04T09:18:29.4991052Z  -e SHARD_NUMBER \ 2025-12-04T09:18:29.4991161Z  -e TEST_CONFIG \ 2025-12-04T09:18:29.4991272Z  -e NUM_TEST_SHARDS \ 2025-12-04T09:18:29.4991387Z  -e REENABLED_ISSUES \ 2025-12-04T09:18:29.4991507Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T09:18:29.4991631Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T09:18:29.4991748Z  -e TEST_SHOWLOCALS \ 2025-12-04T09:18:29.4992067Z  -e NO_TEST_TIMEOUT \ 2025-12-04T09:18:29.4992180Z  -e NO_TD \ 2025-12-04T09:18:29.4992297Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T09:18:29.4992438Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T09:18:29.4992580Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T09:18:29.4992713Z  -e TESTS_TO_INCLUDE \ 2025-12-04T09:18:29.4992831Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T09:18:29.4992954Z  -e DASHBOARD_TAG \ 2025-12-04T09:18:29.4993098Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T09:18:29.4993262Z  --ulimit stack=10485760:83886080 \ 2025-12-04T09:18:29.4993385Z  --ulimit core=0 \ 2025-12-04T09:18:29.4993521Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T09:18:29.4993674Z  --security-opt seccomp=unconfined \ 2025-12-04T09:18:29.4993809Z  --cap-add=SYS_PTRACE \ 2025-12-04T09:18:29.4993929Z  --shm-size="8g" \ 2025-12-04T09:18:29.4994035Z  --tty \ 2025-12-04T09:18:29.4994136Z  --detach \ 2025-12-04T09:18:29.4994245Z  --name="${container_name}" \ 2025-12-04T09:18:29.4994372Z  --user jenkins \ 2025-12-04T09:18:29.4994515Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T09:18:29.4994676Z  -w /var/lib/jenkins/workspace \ 2025-12-04T09:18:29.4994884Z  "${DOCKER_IMAGE}" 2025-12-04T09:18:29.4994999Z ) 2025-12-04T09:18:29.4995119Z # save container name for 
later step 2025-12-04T09:18:29.4995284Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-12-04T09:18:29.4995557Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-12-04T09:18:29.4995904Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-12-04T09:18:29.4998973Z shell: /usr/bin/bash -e {0} 2025-12-04T09:18:29.4999087Z env: 2025-12-04T09:18:29.4999186Z GIT_DEFAULT_BRANCH: main 2025-12-04T09:18:29.4999324Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T09:18:29.4999505Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T09:18:29.4999674Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T09:18:29.5000058Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T09:18:29.5000431Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T09:18:29.5000550Z AWS_REGION: us-east-1 2025-12-04T09:18:29.5000697Z AWS_ACCESS_KEY_ID: *** 2025-12-04T09:18:29.5000852Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T09:18:29.5003109Z AWS_SESSION_TOKEN: *** 2025-12-04T09:18:29.5003234Z BUILD_ENVIRONMENT: linux-jammy-rocm-py3.10 2025-12-04T09:18:29.5003423Z PR_NUMBER: 2025-12-04T09:18:29.5003531Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T09:18:29.5003666Z GITHUB_WORKFLOW: trunk-rocm-mi300 2025-12-04T09:18:29.5003788Z GITHUB_JOB: test 2025-12-04T09:18:29.5003893Z GITHUB_RUN_ID: 19922849170 2025-12-04T09:18:29.5004007Z GITHUB_RUN_NUMBER: 689 2025-12-04T09:18:29.5004118Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T09:18:29.5004224Z JOB_ID: 57116213137 2025-12-04T09:18:29.5004429Z JOB_NAME: linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 
2025-12-04T09:18:29.5004635Z BRANCH: main 2025-12-04T09:18:29.5004750Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:29.5004908Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:29.5005043Z TEST_CONFIG: default 2025-12-04T09:18:29.5005148Z SHARD_NUMBER: 3 2025-12-04T09:18:29.5005246Z NUM_TEST_SHARDS: 6 2025-12-04T09:18:29.5005347Z REENABLED_ISSUES: 2025-12-04T09:18:29.5005458Z CONTINUE_THROUGH_ERROR: True 2025-12-04T09:18:29.5005575Z VERBOSE_TEST_LOGS: False 2025-12-04T09:18:29.5005681Z TEST_SHOWLOCALS: False 2025-12-04T09:18:29.5005788Z NO_TEST_TIMEOUT: False 2025-12-04T09:18:29.5005890Z NO_TD: False 2025-12-04T09:18:29.5006161Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:29.5006462Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T09:18:29.5006596Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T09:18:29.5006724Z TESTS_TO_INCLUDE: 2025-12-04T09:18:29.5006826Z DASHBOARD_TAG: 2025-12-04T09:18:29.5006971Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T09:18:29.5007086Z ##[endgroup] 2025-12-04T09:18:29.5019845Z + [[ default == \m\u\l\t\i\g\p\u ]] 2025-12-04T09:18:29.5019995Z + [[ linux-jammy-rocm-py3.10 == *onnx* ]] 2025-12-04T09:18:29.5020404Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T09:18:29.5026796Z +++ nproc --ignore=2 2025-12-04T09:18:29.5033709Z ++ docker run --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e 
CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=126 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/home/runner/_work/_temp/github_env_19922849170 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_19922849170 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T09:18:29.6574440Z + container_name=617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T09:18:29.6574809Z + echo CONTAINER_NAME=617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T09:18:29.6575299Z + docker exec -t 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-12-04T09:18:32.8373229Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T09:18:33.3769159Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T09:18:33.3770689Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T09:18:33.3771818Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T09:18:33.3773118Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T09:18:33.3774295Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T09:18:33.3776177Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T09:18:33.3940495Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T09:18:33.3963734Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T09:18:33.5921611Z Installing collected packages: torch 2025-12-04T09:18:39.1559252Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T09:18:39.1937081Z + export TERM=vt100 2025-12-04T09:18:39.1937344Z + TERM=vt100 2025-12-04T09:18:39.1941637Z ++ dirname .ci/pytorch/test.sh 2025-12-04T09:18:39.1954951Z + source .ci/pytorch/common.sh 2025-12-04T09:18:39.1960481Z +++ 
dirname .ci/pytorch/common.sh 2025-12-04T09:18:39.1971304Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T09:18:39.1973550Z +++ declare -f -t trap_add 2025-12-04T09:18:39.1978789Z ++ set -ex -o pipefail 2025-12-04T09:18:39.1979000Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T09:18:39.1979213Z ++ unset HIP_PLATFORM 2025-12-04T09:18:39.1979400Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-12-04T09:18:39.1979600Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-12-04T09:18:39.1979796Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T09:18:39.1985610Z ++ dirname .ci/pytorch/test.sh 2025-12-04T09:18:39.1998162Z + source .ci/pytorch/common-build.sh 2025-12-04T09:18:39.1998597Z ++ [[ linux-jammy-rocm-py3.10 != *win-* ]] 2025-12-04T09:18:39.2006942Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T09:18:39.2017441Z +++ cd .ci/pytorch 2025-12-04T09:18:39.2017740Z +++ pwd -P 2025-12-04T09:18:39.2021426Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-12-04T09:18:39.2021759Z ++ [[ linux-jammy-rocm-py3.10 == *-pch* ]] 2025-12-04T09:18:39.2022031Z ++ which sccache 2025-12-04T09:18:39.2035235Z ++ [[ -z '' ]] 2025-12-04T09:18:39.2035408Z ++ unset SCCACHE_BUCKET 2025-12-04T09:18:39.2035588Z ++ unset SCCACHE_REGION 2025-12-04T09:18:39.2035767Z ++ sccache --stop-server 2025-12-04T09:18:39.2057601Z ++ true 2025-12-04T09:18:39.2057769Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-12-04T09:18:39.2070388Z ++ trap_add sccache_epilogue EXIT 2025-12-04T09:18:39.2070587Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T09:18:39.2070769Z ++ shift 2025-12-04T09:18:39.2070917Z ++ for trap_add_name in "$@" 2025-12-04T09:18:39.2076086Z ++++ trap -p EXIT 2025-12-04T09:18:39.2077954Z +++ eval 'extract_trap_cmd ' 2025-12-04T09:18:39.2078124Z ++++ extract_trap_cmd 2025-12-04T09:18:39.2078275Z ++++ printf '%s\n' '' 2025-12-04T09:18:39.2078439Z +++ printf '%s\n' sccache_epilogue 2025-12-04T09:18:39.2080269Z ++ trap -- ' 2025-12-04T09:18:39.2080420Z sccache_epilogue' EXIT 2025-12-04T09:18:39.2080602Z ++ [[ -n '' 
]] 2025-12-04T09:18:39.2080774Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T09:18:39.2081001Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T09:18:39.2081217Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T09:18:39.2081386Z ++ sccache --start-server 2025-12-04T09:18:39.2102010Z sccache: Starting the server... 2025-12-04T09:18:39.2287209Z sccache: Listening on address 127.0.0.1:4226 2025-12-04T09:18:39.2298666Z ++ sccache --zero-stats 2025-12-04T09:18:39.2312665Z Statistics zeroed. 2025-12-04T09:18:39.2316034Z ++ which ccache 2025-12-04T09:18:39.2324440Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-12-04T09:18:39.2324721Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T09:18:39.2324978Z + echo 'Environment variables:' 2025-12-04T09:18:39.2325209Z Environment variables: 2025-12-04T09:18:39.2325406Z + env 2025-12-04T09:18:39.2331057Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T09:18:39.2331335Z CONTINUE_THROUGH_ERROR=True 2025-12-04T09:18:39.2331584Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T09:18:39.2332016Z HOSTNAME=linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd 2025-12-04T09:18:39.2332466Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2332850Z GITHUB_ACTION=__run_2 2025-12-04T09:18:39.2333063Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:18:39.2333286Z GITHUB_RUN_NUMBER=689 2025-12-04T09:18:39.2333472Z TEST_CONFIG=default 2025-12-04T09:18:39.2333720Z RUNNER_NAME=linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd 2025-12-04T09:18:39.2334014Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T09:18:39.2334251Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T09:18:39.2334502Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T09:18:39.2334779Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T09:18:39.2335013Z GITHUB_REF_TYPE=branch 2025-12-04T09:18:39.2335240Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 
2025-12-04T09:18:39.2335704Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T09:18:39.2338166Z *** 2025-12-04T09:18:39.2338321Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T09:18:39.2338505Z GITHUB_ACTIONS=true 2025-12-04T09:18:39.2338695Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2338930Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2339276Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T09:18:39.2339584Z UCC_HOME=/usr 2025-12-04T09:18:39.2339747Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T09:18:39.2339937Z VERBOSE_TEST_LOGS=False 2025-12-04T09:18:39.2340107Z GITHUB_REF=refs/heads/main 2025-12-04T09:18:39.2340284Z RUNNER_OS=Linux 2025-12-04T09:18:39.2340435Z SHARD_NUMBER=3 2025-12-04T09:18:39.2340593Z GITHUB_REF_PROTECTED=true 2025-12-04T09:18:39.2340764Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T09:18:39.2340938Z HOME=/var/lib/jenkins 2025-12-04T09:18:39.2341124Z GITHUB_API_URL=https://api.github.com 2025-12-04T09:18:39.2341445Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T09:18:39.2341661Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T09:18:39.2341919Z LANG=C.UTF-8 2025-12-04T09:18:39.2342096Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T09:18:39.2342326Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T09:18:39.2342550Z RUNNER_TRACKING_ID=github_96ba7c2b-4e53-4bfa-a8cd-e94dafc1fabe 2025-12-04T09:18:39.2342793Z RUNNER_ARCH=X64 2025-12-04T09:18:39.2342953Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T09:18:39.2343141Z NUM_TEST_SHARDS=6 2025-12-04T09:18:39.2343294Z UCX_HOME=/usr 2025-12-04T09:18:39.2343592Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2344077Z JOB_NAME=linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:39.2344413Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T09:18:39.2344723Z 
GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2345105Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T09:18:39.2345360Z GITHUB_EVENT_NAME=schedule 2025-12-04T09:18:39.2345614Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T09:18:39.2345870Z DASHBOARD_TAG= 2025-12-04T09:18:39.2346022Z GITHUB_RUN_ID=19922849170 2025-12-04T09:18:39.2346357Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2346722Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T09:18:39.2346960Z PR_NUMBER= 2025-12-04T09:18:39.2347079Z GITHUB_RUN_ATTEMPT=1 2025-12-04T09:18:39.2347215Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T09:18:39.2347380Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T09:18:39.2347554Z TERM=vt100 2025-12-04T09:18:39.2347669Z INSTALLED_VISION=yes 2025-12-04T09:18:39.2347794Z BRANCH=main 2025-12-04T09:18:39.2347921Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T09:18:39.2348063Z TESTS_TO_INCLUDE= 2025-12-04T09:18:39.2348252Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T09:18:39.2348491Z GITHUB_SERVER_URL=https://github.com 2025-12-04T09:18:39.2348657Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T09:18:39.2348850Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T09:18:39.2349019Z REENABLED_ISSUES= 2025-12-04T09:18:39.2349144Z SHLVL=1 2025-12-04T09:18:39.2349248Z MAX_JOBS=126 2025-12-04T09:18:39.2349411Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T09:18:39.2349610Z GITHUB_ACTOR_ID=97764156 2025-12-04T09:18:39.2349761Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T09:18:39.2349960Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2350143Z GITHUB_REF_NAME=main 2025-12-04T09:18:39.2350275Z ROCM_PATH=/opt/rocm 
2025-12-04T09:18:39.2350394Z GITHUB_JOB=test 2025-12-04T09:18:39.2350520Z NO_TEST_TIMEOUT=False 2025-12-04T09:18:39.2350664Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T09:18:39.2350810Z LC_ALL=C.UTF-8 2025-12-04T09:18:39.2350938Z GITHUB_RETENTION_DAYS=90 2025-12-04T09:18:39.2351083Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T09:18:39.2351244Z OPENSSL_DIR=/opt/openssl 2025-12-04T09:18:39.2351375Z GITHUB_ACTION_REPOSITORY= 2025-12-04T09:18:39.2351814Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:18:39.2352309Z GITHUB_BASE_REF= 2025-12-04T09:18:39.2352428Z CI=true 2025-12-04T09:18:39.2352545Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T09:18:39.2352692Z JOB_ID=57116213137 2025-12-04T09:18:39.2352809Z GITHUB_HEAD_REF= 2025-12-04T09:18:39.2352933Z GITHUB_ACTION_REF= 2025-12-04T09:18:39.2353052Z TEST_SHOWLOCALS=False 2025-12-04T09:18:39.2353193Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T09:18:39.2353346Z DEBIAN_FRONTEND=noninteractive 2025-12-04T09:18:39.2353652Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2353905Z NO_TD=False 2025-12-04T09:18:39.2354022Z OLDPWD=/var/lib/jenkins 2025-12-04T09:18:39.2354153Z _=/usr/bin/env 2025-12-04T09:18:39.2354316Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T09:18:39.2397659Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-12-04T09:18:39.2397891Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-12-04T09:18:39.2398125Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-12-04T09:18:39.2398348Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-12-04T09:18:39.2398516Z + BUILD_DIR=build 
2025-12-04T09:18:39.2398623Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T09:18:39.2398746Z + BUILD_BIN_DIR=build/bin 2025-12-04T09:18:39.2398855Z + SHARD_NUMBER=3 2025-12-04T09:18:39.2398975Z + NUM_TEST_SHARDS=6 2025-12-04T09:18:39.2399085Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:18:39.2399210Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:18:39.2399327Z + export VALGRIND=ON 2025-12-04T09:18:39.2399431Z + VALGRIND=ON 2025-12-04T09:18:39.2399542Z + [[ linux-jammy-rocm-py3.10 == *clang9* ]] 2025-12-04T09:18:39.2399683Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T09:18:39.2399811Z + detect_cuda_arch 2025-12-04T09:18:39.2399922Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T09:18:39.2400060Z + [[ linux-jammy-rocm-py3.10 == *s390x* ]] 2025-12-04T09:18:39.2400242Z + [[ 0 == \1 ]] 2025-12-04T09:18:39.2400340Z + [[ True == \1 ]] 2025-12-04T09:18:39.2400451Z + [[ linux-jammy-rocm-py3.10 != *bazel* ]] 2025-12-04T09:18:39.2401453Z ++ realpath build/custom_test_artifacts 2025-12-04T09:18:39.2408667Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-12-04T09:18:39.2408861Z + [[ -n '' ]] 2025-12-04T09:18:39.2408970Z + echo 'Environment variables' 2025-12-04T09:18:39.2409092Z Environment variables 2025-12-04T09:18:39.2409206Z + env 2025-12-04T09:18:39.2416833Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T09:18:39.2416998Z CONTINUE_THROUGH_ERROR=True 2025-12-04T09:18:39.2417130Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T09:18:39.2417305Z HOSTNAME=linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd 2025-12-04T09:18:39.2417540Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2417747Z GITHUB_ACTION=__run_2 2025-12-04T09:18:39.2417868Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T09:18:39.2417989Z GITHUB_RUN_NUMBER=689 2025-12-04T09:18:39.2418094Z TEST_CONFIG=default 2025-12-04T09:18:39.2418233Z 
RUNNER_NAME=linux.rocm.gpu.gfx942.1.b-gwk9b-runner-jfbtd 2025-12-04T09:18:39.2418393Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T09:18:39.2418521Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T09:18:39.2418661Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T09:18:39.2418815Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T09:18:39.2418963Z GITHUB_REF_TYPE=branch 2025-12-04T09:18:39.2419090Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2419283Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T09:18:39.2419422Z *** 2025-12-04T09:18:39.2419519Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T09:18:39.2419634Z GITHUB_ACTIONS=true 2025-12-04T09:18:39.2419753Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2419903Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2420164Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T09:18:39.2420359Z UCC_HOME=/usr 2025-12-04T09:18:39.2420463Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T09:18:39.2420584Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T09:18:39.2429530Z VERBOSE_TEST_LOGS=False 2025-12-04T09:18:39.2429662Z GITHUB_REF=refs/heads/main 2025-12-04T09:18:39.2429783Z RUNNER_OS=Linux 2025-12-04T09:18:39.2429887Z SHARD_NUMBER=3 2025-12-04T09:18:39.2430076Z GITHUB_REF_PROTECTED=true 2025-12-04T09:18:39.2430197Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T09:18:39.2430313Z HOME=/var/lib/jenkins 2025-12-04T09:18:39.2430438Z GITHUB_API_URL=https://api.github.com 2025-12-04T09:18:39.2430581Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T09:18:39.2430727Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T09:18:39.2430863Z LANG=C.UTF-8 2025-12-04T09:18:39.2430987Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T09:18:39.2431137Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T09:18:39.2431294Z RUNNER_TRACKING_ID=github_96ba7c2b-4e53-4bfa-a8cd-e94dafc1fabe 2025-12-04T09:18:39.2431453Z RUNNER_ARCH=X64 
2025-12-04T09:18:39.2431564Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T09:18:39.2431691Z NUM_TEST_SHARDS=6 2025-12-04T09:18:39.2431795Z UCX_HOME=/usr 2025-12-04T09:18:39.2432053Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2432367Z JOB_NAME=linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1.b, mem_leak_check, unstable) 2025-12-04T09:18:39.2432583Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T09:18:39.2432784Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2433034Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T09:18:39.2433205Z GITHUB_EVENT_NAME=schedule 2025-12-04T09:18:39.2433370Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T09:18:39.2433579Z DASHBOARD_TAG= 2025-12-04T09:18:39.2433686Z GITHUB_RUN_ID=19922849170 2025-12-04T09:18:39.2433902Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2434134Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T09:18:39.2434252Z PR_NUMBER= 2025-12-04T09:18:39.2434351Z GITHUB_RUN_ATTEMPT=1 2025-12-04T09:18:39.2434460Z VALGRIND=ON 2025-12-04T09:18:39.2434563Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T09:18:39.2434713Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T09:18:39.2434853Z TERM=vt100 2025-12-04T09:18:39.2434952Z INSTALLED_VISION=yes 2025-12-04T09:18:39.2435060Z BRANCH=main 2025-12-04T09:18:39.2435165Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T09:18:39.2435286Z TESTS_TO_INCLUDE= 2025-12-04T09:18:39.2435453Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T09:18:39.2435649Z GITHUB_SERVER_URL=https://github.com 2025-12-04T09:18:39.2435797Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T09:18:39.2435961Z 
UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T09:18:39.2436104Z REENABLED_ISSUES= 2025-12-04T09:18:39.2436206Z SHLVL=1 2025-12-04T09:18:39.2436299Z MAX_JOBS=126 2025-12-04T09:18:39.2436435Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T09:18:39.2436593Z GITHUB_ACTOR_ID=97764156 2025-12-04T09:18:39.2436717Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T09:18:39.2436883Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T09:18:39.2437032Z GITHUB_REF_NAME=main 2025-12-04T09:18:39.2437139Z ROCM_PATH=/opt/rocm 2025-12-04T09:18:39.2437244Z GITHUB_JOB=test 2025-12-04T09:18:39.2437349Z NO_TEST_TIMEOUT=False 2025-12-04T09:18:39.2437467Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T09:18:39.2437592Z LC_ALL=C.UTF-8 2025-12-04T09:18:39.2437696Z GITHUB_RETENTION_DAYS=90 2025-12-04T09:18:39.2437824Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T09:18:39.2437956Z OPENSSL_DIR=/opt/openssl 2025-12-04T09:18:39.2438074Z GITHUB_ACTION_REPOSITORY= 2025-12-04T09:18:39.2438431Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:18:39.2438783Z GITHUB_BASE_REF= 2025-12-04T09:18:39.2438882Z CI=true 2025-12-04T09:18:39.2438980Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T09:18:39.2439152Z JOB_ID=57116213137 2025-12-04T09:18:39.2439252Z GITHUB_HEAD_REF= 2025-12-04T09:18:39.2439351Z GITHUB_ACTION_REF= 2025-12-04T09:18:39.2439449Z TEST_SHOWLOCALS=False 2025-12-04T09:18:39.2439562Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T09:18:39.2439686Z DEBIAN_FRONTEND=noninteractive 2025-12-04T09:18:39.2439891Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_2bdd9d81-5b60-45e5-9d95-0e21b85ebd89 2025-12-04T09:18:39.2440097Z NO_TD=False 2025-12-04T09:18:39.2440192Z OLDPWD=/var/lib/jenkins 2025-12-04T09:18:39.2440299Z 
_=/usr/bin/env 2025-12-04T09:18:39.2440399Z + echo 'Testing pytorch' 2025-12-04T09:18:39.2440508Z Testing pytorch 2025-12-04T09:18:39.2440609Z + export LANG=C.UTF-8 2025-12-04T09:18:39.2440713Z + LANG=C.UTF-8 2025-12-04T09:18:39.2440807Z + PR_NUMBER= 2025-12-04T09:18:39.2440909Z + [[ default == \d\e\f\a\u\l\t ]] 2025-12-04T09:18:39.2441033Z + export CUDA_VISIBLE_DEVICES=0 2025-12-04T09:18:39.2441153Z + CUDA_VISIBLE_DEVICES=0 2025-12-04T09:18:39.2441267Z + export HIP_VISIBLE_DEVICES=0 2025-12-04T09:18:39.2441387Z + HIP_VISIBLE_DEVICES=0 2025-12-04T09:18:39.2441504Z + [[ default == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T09:18:39.2441630Z + [[ default == \s\l\o\w ]] 2025-12-04T09:18:39.2441764Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-12-04T09:18:39.2441966Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T09:18:39.2442100Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T09:18:39.2442238Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T09:18:39.2442377Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T09:18:39.2442575Z + [[ default == *crossref* ]] 2025-12-04T09:18:39.2442698Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T09:18:39.2442822Z + export VALGRIND=OFF 2025-12-04T09:18:39.2442926Z + VALGRIND=OFF 2025-12-04T09:18:39.2443021Z + rocminfo 2025-12-04T09:18:39.2544065Z ROCk module version 6.12.12 is loaded 2025-12-04T09:18:39.2904609Z ===================== 2025-12-04T09:18:39.2904794Z HSA System Attributes 2025-12-04T09:18:39.2904968Z ===================== 2025-12-04T09:18:39.2905106Z Runtime Version: 1.18 2025-12-04T09:18:39.2905252Z Runtime Ext Version: 1.14 2025-12-04T09:18:39.2905498Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T09:18:39.2905758Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T09:18:39.2906081Z Machine Model: LARGE 2025-12-04T09:18:39.2906321Z System Endianness: LITTLE 2025-12-04T09:18:39.2906510Z Mwaitx: DISABLED 2025-12-04T09:18:39.2906676Z XNACK enabled: NO 2025-12-04T09:18:39.2906829Z DMAbuf Support: YES 2025-12-04T09:18:39.2906976Z VMM Support: YES 2025-12-04T09:18:39.2907070Z 2025-12-04T09:18:39.2907121Z ========== 2025-12-04T09:18:39.2907260Z HSA Agents 2025-12-04T09:18:39.2907392Z ========== 2025-12-04T09:18:39.2907525Z ******* 2025-12-04T09:18:39.2907735Z Agent 1 2025-12-04T09:18:39.2907951Z ******* 2025-12-04T09:18:39.2908111Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.2908309Z Uuid: CPU-XX 2025-12-04T09:18:39.2908514Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.2908726Z Vendor Name: CPU 2025-12-04T09:18:39.2908927Z Feature: None specified 2025-12-04T09:18:39.2909132Z Profile: FULL_PROFILE 2025-12-04T09:18:39.2909336Z Float Round Mode: NEAR 2025-12-04T09:18:39.2909536Z Max Queue Number: 0(0x0) 2025-12-04T09:18:39.2909724Z Queue Min Size: 0(0x0) 2025-12-04T09:18:39.2909918Z Queue Max Size: 0(0x0) 2025-12-04T09:18:39.2910505Z Queue Type: MULTI 2025-12-04T09:18:39.2910705Z Node: 0 2025-12-04T09:18:39.2910927Z Device Type: CPU 2025-12-04T09:18:39.2911109Z Cache Info: 2025-12-04T09:18:39.2911271Z L1: 49152(0xc000) KB 2025-12-04T09:18:39.2911453Z Chip ID: 0(0x0) 2025-12-04T09:18:39.2911646Z ASIC Revision: 0(0x0) 2025-12-04T09:18:39.2911902Z Cacheline Size: 64(0x40) 2025-12-04T09:18:39.2912105Z Max Clock Freq. (MHz): 3300 2025-12-04T09:18:39.2912292Z BDFID: 0 2025-12-04T09:18:39.2912481Z Internal Node ID: 0 2025-12-04T09:18:39.2912769Z Compute Unit: 64 2025-12-04T09:18:39.2912968Z SIMDs per CU: 0 2025-12-04T09:18:39.2913155Z Shader Engines: 0 2025-12-04T09:18:39.2913352Z Shader Arrs. per Eng.: 0 2025-12-04T09:18:39.2913562Z WatchPts on Addr. 
Ranges:1 2025-12-04T09:18:39.2913750Z Memory Properties: 2025-12-04T09:18:39.2913895Z Features: None 2025-12-04T09:18:39.2914041Z Pool Info: 2025-12-04T09:18:39.2914310Z Pool 1 2025-12-04T09:18:39.2914480Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T09:18:39.2914683Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T09:18:39.2914883Z Allocatable: TRUE 2025-12-04T09:18:39.2915085Z Alloc Granule: 4KB 2025-12-04T09:18:39.2915301Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2915519Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2915736Z Accessible by all: TRUE 2025-12-04T09:18:39.2915914Z Pool 2 2025-12-04T09:18:39.2916101Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T09:18:39.2916303Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T09:18:39.2916505Z Allocatable: TRUE 2025-12-04T09:18:39.2916708Z Alloc Granule: 4KB 2025-12-04T09:18:39.2916919Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2917126Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2917308Z Accessible by all: TRUE 2025-12-04T09:18:39.2917459Z Pool 3 2025-12-04T09:18:39.2917595Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T09:18:39.2917750Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T09:18:39.2917906Z Allocatable: TRUE 2025-12-04T09:18:39.2918068Z Alloc Granule: 4KB 2025-12-04T09:18:39.2918233Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2918405Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2918576Z Accessible by all: TRUE 2025-12-04T09:18:39.2918719Z Pool 4 2025-12-04T09:18:39.2918855Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T09:18:39.2919013Z Size: 1584777168(0x5e75c7d0) KB 2025-12-04T09:18:39.2919221Z Allocatable: TRUE 2025-12-04T09:18:39.2919384Z Alloc Granule: 4KB 2025-12-04T09:18:39.2919551Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2919725Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2919889Z Accessible by all: TRUE 2025-12-04T09:18:39.2920034Z ISA Info: 2025-12-04T09:18:39.2920143Z ******* 2025-12-04T09:18:39.2920258Z Agent 2 2025-12-04T09:18:39.2920364Z ******* 
2025-12-04T09:18:39.2920491Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.2920650Z Uuid: CPU-XX 2025-12-04T09:18:39.2920810Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.2920976Z Vendor Name: CPU 2025-12-04T09:18:39.2921139Z Feature: None specified 2025-12-04T09:18:39.2921300Z Profile: FULL_PROFILE 2025-12-04T09:18:39.2921461Z Float Round Mode: NEAR 2025-12-04T09:18:39.2921623Z Max Queue Number: 0(0x0) 2025-12-04T09:18:39.2921783Z Queue Min Size: 0(0x0) 2025-12-04T09:18:39.2922013Z Queue Max Size: 0(0x0) 2025-12-04T09:18:39.2922212Z Queue Type: MULTI 2025-12-04T09:18:39.2922359Z Node: 1 2025-12-04T09:18:39.2922508Z Device Type: CPU 2025-12-04T09:18:39.2922651Z Cache Info: 2025-12-04T09:18:39.2922773Z L1: 49152(0xc000) KB 2025-12-04T09:18:39.2922927Z Chip ID: 0(0x0) 2025-12-04T09:18:39.2923079Z ASIC Revision: 0(0x0) 2025-12-04T09:18:39.2923238Z Cacheline Size: 64(0x40) 2025-12-04T09:18:39.2923397Z Max Clock Freq. (MHz): 3300 2025-12-04T09:18:39.2923553Z BDFID: 0 2025-12-04T09:18:39.2923706Z Internal Node ID: 1 2025-12-04T09:18:39.2923864Z Compute Unit: 64 2025-12-04T09:18:39.2924025Z SIMDs per CU: 0 2025-12-04T09:18:39.2924184Z Shader Engines: 0 2025-12-04T09:18:39.2924345Z Shader Arrs. per Eng.: 0 2025-12-04T09:18:39.2924510Z WatchPts on Addr. 
Ranges:1 2025-12-04T09:18:39.2924658Z Memory Properties: 2025-12-04T09:18:39.2924779Z Features: None 2025-12-04T09:18:39.2924894Z Pool Info: 2025-12-04T09:18:39.2925010Z Pool 1 2025-12-04T09:18:39.2925147Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T09:18:39.2925308Z Size: 1585311828(0x5e7df054) KB 2025-12-04T09:18:39.2925465Z Allocatable: TRUE 2025-12-04T09:18:39.2925636Z Alloc Granule: 4KB 2025-12-04T09:18:39.2925821Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2925995Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2926166Z Accessible by all: TRUE 2025-12-04T09:18:39.2926313Z Pool 2 2025-12-04T09:18:39.2926455Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T09:18:39.2926653Z Size: 1585311828(0x5e7df054) KB 2025-12-04T09:18:39.2926809Z Allocatable: TRUE 2025-12-04T09:18:39.2926973Z Alloc Granule: 4KB 2025-12-04T09:18:39.2927153Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2927327Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2927502Z Accessible by all: TRUE 2025-12-04T09:18:39.2927650Z Pool 3 2025-12-04T09:18:39.2927787Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T09:18:39.2927943Z Size: 1585311828(0x5e7df054) KB 2025-12-04T09:18:39.2928098Z Allocatable: TRUE 2025-12-04T09:18:39.2928257Z Alloc Granule: 4KB 2025-12-04T09:18:39.2928424Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2928588Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2928746Z Accessible by all: TRUE 2025-12-04T09:18:39.2928887Z Pool 4 2025-12-04T09:18:39.2929020Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T09:18:39.2929169Z Size: 1585311828(0x5e7df054) KB 2025-12-04T09:18:39.2929316Z Allocatable: TRUE 2025-12-04T09:18:39.2929507Z Alloc Granule: 4KB 2025-12-04T09:18:39.2929670Z Alloc Recommended Granule:4KB 2025-12-04T09:18:39.2929832Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2929991Z Accessible by all: TRUE 2025-12-04T09:18:39.2930139Z ISA Info: 2025-12-04T09:18:39.2930245Z ******* 2025-12-04T09:18:39.2930347Z Agent 3 2025-12-04T09:18:39.2930451Z ******* 
2025-12-04T09:18:39.2930573Z Name: gfx942 2025-12-04T09:18:39.2930724Z Uuid: GPU-45724ba446fb6af5 2025-12-04T09:18:39.2930879Z Marketing Name: 2025-12-04T09:18:39.2931032Z Vendor Name: AMD 2025-12-04T09:18:39.2931198Z Feature: KERNEL_DISPATCH 2025-12-04T09:18:39.2931362Z Profile: BASE_PROFILE 2025-12-04T09:18:39.2931530Z Float Round Mode: NEAR 2025-12-04T09:18:39.2931692Z Max Queue Number: 128(0x80) 2025-12-04T09:18:39.2931887Z Queue Min Size: 64(0x40) 2025-12-04T09:18:39.2932041Z Queue Max Size: 131072(0x20000) 2025-12-04T09:18:39.2932196Z Queue Type: MULTI 2025-12-04T09:18:39.2932344Z Node: 2 2025-12-04T09:18:39.2932491Z Device Type: GPU 2025-12-04T09:18:39.2932629Z Cache Info: 2025-12-04T09:18:39.2932747Z L1: 32(0x20) KB 2025-12-04T09:18:39.2932882Z L2: 4096(0x1000) KB 2025-12-04T09:18:39.2933030Z L3: 262144(0x40000) KB 2025-12-04T09:18:39.2933171Z Chip ID: 29861(0x74a5) 2025-12-04T09:18:39.2933323Z ASIC Revision: 1(0x1) 2025-12-04T09:18:39.2933483Z Cacheline Size: 128(0x80) 2025-12-04T09:18:39.2933690Z Max Clock Freq. (MHz): 2100 2025-12-04T09:18:39.2933844Z BDFID: 1280 2025-12-04T09:18:39.2933995Z Internal Node ID: 2 2025-12-04T09:18:39.2934155Z Compute Unit: 304 2025-12-04T09:18:39.2934308Z SIMDs per CU: 4 2025-12-04T09:18:39.2934461Z Shader Engines: 32 2025-12-04T09:18:39.2934616Z Shader Arrs. per Eng.: 1 2025-12-04T09:18:39.2934776Z WatchPts on Addr. 
Ranges:4 2025-12-04T09:18:39.2934941Z Coherent Host Access: FALSE 2025-12-04T09:18:39.2935086Z Memory Properties: 2025-12-04T09:18:39.2935207Z Features: KERNEL_DISPATCH 2025-12-04T09:18:39.2935353Z Fast F16 Operation: TRUE 2025-12-04T09:18:39.2935522Z Wavefront Size: 64(0x40) 2025-12-04T09:18:39.2935684Z Workgroup Max Size: 1024(0x400) 2025-12-04T09:18:39.2935835Z Workgroup Max Size per Dimension: 2025-12-04T09:18:39.2935970Z x 1024(0x400) 2025-12-04T09:18:39.2936110Z y 1024(0x400) 2025-12-04T09:18:39.2936238Z z 1024(0x400) 2025-12-04T09:18:39.2936383Z Max Waves Per CU: 32(0x20) 2025-12-04T09:18:39.2936595Z Max Work-item Per CU: 2048(0x800) 2025-12-04T09:18:39.2936753Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T09:18:39.2936892Z Grid Max Size per Dimension: 2025-12-04T09:18:39.2937011Z x 2147483647(0x7fffffff) 2025-12-04T09:18:39.2937150Z y 65535(0xffff) 2025-12-04T09:18:39.2937286Z z 65535(0xffff) 2025-12-04T09:18:39.2937440Z Max fbarriers/Workgrp: 32 2025-12-04T09:18:39.2937656Z Packet Processor uCode:: 185 2025-12-04T09:18:39.2937823Z SDMA engine uCode:: 24 2025-12-04T09:18:39.2937988Z IOMMU Support:: None 2025-12-04T09:18:39.2938132Z Pool Info: 2025-12-04T09:18:39.2938253Z Pool 1 2025-12-04T09:18:39.2938397Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T09:18:39.2938565Z Size: 268419072(0xfffc000) KB 2025-12-04T09:18:39.2938728Z Allocatable: TRUE 2025-12-04T09:18:39.2938896Z Alloc Granule: 4KB 2025-12-04T09:18:39.2939070Z Alloc Recommended Granule:2048KB 2025-12-04T09:18:39.2939245Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2939416Z Accessible by all: FALSE 2025-12-04T09:18:39.2939559Z Pool 2 2025-12-04T09:18:39.2939700Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T09:18:39.2939862Z Size: 268419072(0xfffc000) KB 2025-12-04T09:18:39.2940021Z Allocatable: TRUE 2025-12-04T09:18:39.2940191Z Alloc Granule: 4KB 2025-12-04T09:18:39.2940363Z Alloc Recommended Granule:2048KB 2025-12-04T09:18:39.2940538Z Alloc Alignment: 4KB 
2025-12-04T09:18:39.2940707Z Accessible by all: FALSE 2025-12-04T09:18:39.2940857Z Pool 3 2025-12-04T09:18:39.2941029Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T09:18:39.2941186Z Size: 268419072(0xfffc000) KB 2025-12-04T09:18:39.2941342Z Allocatable: TRUE 2025-12-04T09:18:39.2941510Z Alloc Granule: 4KB 2025-12-04T09:18:39.2941680Z Alloc Recommended Granule:2048KB 2025-12-04T09:18:39.2941983Z Alloc Alignment: 4KB 2025-12-04T09:18:39.2942155Z Accessible by all: FALSE 2025-12-04T09:18:39.2942302Z Pool 4 2025-12-04T09:18:39.2942437Z Segment: GROUP 2025-12-04T09:18:39.2942592Z Size: 64(0x40) KB 2025-12-04T09:18:39.2942747Z Allocatable: FALSE 2025-12-04T09:18:39.2942916Z Alloc Granule: 0KB 2025-12-04T09:18:39.2943086Z Alloc Recommended Granule:0KB 2025-12-04T09:18:39.2943258Z Alloc Alignment: 0KB 2025-12-04T09:18:39.2943427Z Accessible by all: FALSE 2025-12-04T09:18:39.2943576Z ISA Info: 2025-12-04T09:18:39.2943691Z ISA 1 2025-12-04T09:18:39.2943834Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T09:18:39.2944046Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T09:18:39.2944218Z Profiles: HSA_PROFILE_BASE 2025-12-04T09:18:39.2944416Z Default Rounding Mode: NEAR 2025-12-04T09:18:39.2944590Z Default Rounding Mode: NEAR 2025-12-04T09:18:39.2944761Z Fast f16: TRUE 2025-12-04T09:18:39.2944927Z Workgroup Max Size: 1024(0x400) 2025-12-04T09:18:39.2945084Z Workgroup Max Size per Dimension: 2025-12-04T09:18:39.2945227Z x 1024(0x400) 2025-12-04T09:18:39.2945372Z y 1024(0x400) 2025-12-04T09:18:39.2945508Z z 1024(0x400) 2025-12-04T09:18:39.2945663Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T09:18:39.2945823Z Grid Max Size per Dimension: 2025-12-04T09:18:39.2945958Z x 2147483647(0x7fffffff) 2025-12-04T09:18:39.2946102Z y 65535(0xffff) 2025-12-04T09:18:39.2946243Z z 65535(0xffff) 2025-12-04T09:18:39.2946401Z FBarrier Max Size: 32 2025-12-04T09:18:39.2946548Z ISA 2 2025-12-04T09:18:39.2946705Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 
2025-12-04T09:18:39.2946892Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T09:18:39.2947064Z Profiles: HSA_PROFILE_BASE 2025-12-04T09:18:39.2947235Z Default Rounding Mode: NEAR 2025-12-04T09:18:39.2947412Z Default Rounding Mode: NEAR 2025-12-04T09:18:39.2947578Z Fast f16: TRUE 2025-12-04T09:18:39.2947739Z Workgroup Max Size: 1024(0x400) 2025-12-04T09:18:39.2947894Z Workgroup Max Size per Dimension: 2025-12-04T09:18:39.2948030Z x 1024(0x400) 2025-12-04T09:18:39.2948208Z y 1024(0x400) 2025-12-04T09:18:39.2948348Z z 1024(0x400) 2025-12-04T09:18:39.2948501Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T09:18:39.2948653Z Grid Max Size per Dimension: 2025-12-04T09:18:39.2948785Z x 2147483647(0x7fffffff) 2025-12-04T09:18:39.2948927Z y 65535(0xffff) 2025-12-04T09:18:39.2949068Z z 65535(0xffff) 2025-12-04T09:18:39.2949227Z FBarrier Max Size: 32 2025-12-04T09:18:39.2949374Z *** Done *** 2025-12-04T09:18:39.2957864Z + rocminfo 2025-12-04T09:18:39.2959575Z + grep -E 'Name:.*\sgfx|Marketing' 2025-12-04T09:18:39.3437968Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.3438514Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T09:18:39.3439038Z Name: gfx942 2025-12-04T09:18:39.3439464Z Marketing Name: 2025-12-04T09:18:39.3487426Z + MAYBE_ROCM=rocm/ 2025-12-04T09:18:39.3487879Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T09:18:39.3488075Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T09:18:39.3488220Z + pip_install ninja==1.10.2 2025-12-04T09:18:39.3488384Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T09:18:39.3488576Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T09:18:39.5418417Z Collecting ninja==1.10.2 2025-12-04T09:18:39.5691087Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T09:18:39.5771263Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 
2025-12-04T09:18:39.7429864Z Installing collected packages: ninja 2025-12-04T09:18:39.7430204Z Attempting uninstall: ninja 2025-12-04T09:18:39.7434335Z Found existing installation: ninja 1.11.1.4 2025-12-04T09:18:39.7444781Z Uninstalling ninja-1.11.1.4: 2025-12-04T09:18:39.7474871Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T09:18:39.7583566Z Successfully installed ninja-1.10.2 2025-12-04T09:18:39.8000856Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:18:39.8002867Z + PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T09:18:39.8003999Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T09:18:39.8004386Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-12-04T09:18:39.8004768Z + [[ linux-jammy-rocm-py3.10 == *-debug* ]] 2025-12-04T09:18:39.8005144Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T09:18:39.8005667Z + echo 'We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass' 2025-12-04T09:18:39.8006293Z We are not in debug mode: linux-jammy-rocm-py3.10. 
Expect the assertion to pass 2025-12-04T09:18:39.8006750Z + cd test 2025-12-04T09:18:39.8007132Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T09:18:40.6783494Z + [[ default == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T09:18:40.6783943Z + [[ default == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T09:18:40.6784352Z + [[ default == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T09:18:40.6785880Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T09:18:40.6786306Z + [[ default == *pr_time_benchmarks* ]] 2025-12-04T09:18:40.6786666Z + [[ default == *dynamo_eager* ]] 2025-12-04T09:18:40.6787008Z + [[ default == *aot_eager* ]] 2025-12-04T09:18:40.6787593Z + [[ default == *aot_inductor* ]] 2025-12-04T09:18:40.6787880Z + [[ default == *max_autotune_inductor* ]] 2025-12-04T09:18:40.6788178Z + [[ default == *inductor* ]] 2025-12-04T09:18:40.6788442Z + [[ default == *dynamic* ]] 2025-12-04T09:18:40.6788696Z + [[ default == *cpu* ]] 2025-12-04T09:18:40.6788932Z + [[ default == *xpu* ]] 2025-12-04T09:18:40.6789211Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T09:18:40.6802183Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T09:18:40.6802482Z + [[ linux-jammy-rocm-py3.10 == *-bazel-* ]] 2025-12-04T09:18:40.6805104Z + cd test 2025-12-04T09:18:40.6805562Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T09:18:41.3901704Z PyTorch built with: 2025-12-04T09:18:41.3902130Z - GCC 11.4 2025-12-04T09:18:41.3902402Z - C++ Version: 201703 2025-12-04T09:18:41.3903029Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T09:18:41.3903823Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T09:18:41.3904308Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T09:18:41.3904680Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T09:18:41.3905031Z - NNPACK is enabled 2025-12-04T09:18:41.3905336Z - CPU capability usage: AVX512 2025-12-04T09:18:41.3905655Z - HIP Runtime 7.1.25424 2025-12-04T09:18:41.3905944Z - MIOpen 3.5.1 2025-12-04T09:18:41.3906201Z - Magma 2.9.0 2025-12-04T09:18:41.3909574Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=35b7a9a26c5923d98aebaa41a031dae21788a9ee, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T09:18:41.3912353Z 2025-12-04T09:18:41.6622211Z + cd test 2025-12-04T09:18:41.6622397Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T09:18:42.2915858Z ATen/Parallel: 2025-12-04T09:18:42.2916163Z at::get_num_threads() : 128 2025-12-04T09:18:42.2916448Z at::get_num_interop_threads() : 128 2025-12-04T09:18:42.2916730Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T09:18:42.2917020Z omp_get_max_threads() : 128 2025-12-04T09:18:42.2917498Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T09:18:42.2918024Z mkl_get_max_threads() : 128 2025-12-04T09:18:42.2918369Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T09:18:42.2918747Z std::thread::hardware_concurrency() : 128 2025-12-04T09:18:42.2919020Z Environment variables: 2025-12-04T09:18:42.2919250Z OMP_NUM_THREADS : [not set] 2025-12-04T09:18:42.2919485Z MKL_NUM_THREADS : [not set] 2025-12-04T09:18:42.2919737Z ATen parallel backend: OpenMP 2025-12-04T09:18:42.2919897Z 2025-12-04T09:18:42.5325113Z + [[ default == *numpy_2* ]] 2025-12-04T09:18:42.5325501Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T09:18:42.5325803Z + [[ default == *backward* ]] 2025-12-04T09:18:42.5326078Z + [[ default == *libtorch_agnostic_targetting* ]] 2025-12-04T09:18:42.5326364Z + [[ default == *xla* ]] 2025-12-04T09:18:42.5326583Z + [[ default == *vllm* ]] 2025-12-04T09:18:42.5327480Z + [[ default == *executorch* ]] 2025-12-04T09:18:42.5327725Z + [[ default == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T09:18:42.5327997Z + [[ default == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T09:18:42.5328279Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T09:18:42.5328556Z + [[ default == distributed ]] 2025-12-04T09:18:42.5328804Z + [[ default == *operator_benchmark* ]] 2025-12-04T09:18:42.5329082Z + [[ default == *operator_microbenchmark* ]] 2025-12-04T09:18:42.5329357Z + [[ default == *attention_microbenchmark* ]] 2025-12-04T09:18:42.5329649Z + [[ default == *inductor_distributed* ]] 2025-12-04T09:18:42.5329915Z + [[ default == *inductor-halide* ]] 2025-12-04T09:18:42.5330179Z + [[ default == *inductor-pallas* ]] 2025-12-04T09:18:42.5330445Z + [[ default == *inductor-triton-cpu* ]] 2025-12-04T09:18:42.5330734Z + [[ default == *inductor-micro-benchmark* ]] 
2025-12-04T09:18:42.5331043Z + [[ default == *aoti_cross_compile_for_windows* ]] 2025-12-04T09:18:42.5331343Z + [[ default == *huggingface* ]] 2025-12-04T09:18:42.5331580Z + [[ default == *timm* ]] 2025-12-04T09:18:42.5331807Z + [[ default == cachebench ]] 2025-12-04T09:18:42.5332106Z + [[ default == verify_cachebench ]] 2025-12-04T09:18:42.5332357Z + [[ default == *torchbench* ]] 2025-12-04T09:18:42.5332606Z + [[ default == *inductor_cpp_wrapper* ]] 2025-12-04T09:18:42.5332875Z + [[ default == *inductor_core* ]] 2025-12-04T09:18:42.5333117Z + [[ default == *inductor* ]] 2025-12-04T09:18:42.5333349Z + [[ default == *einops* ]] 2025-12-04T09:18:42.5333588Z + [[ default == *dynamo_core* ]] 2025-12-04T09:18:42.5333971Z + [[ default == *dynamo_wrapped* ]] 2025-12-04T09:18:42.5334239Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T09:18:42.5334501Z + [[ -n '' ]] 2025-12-04T09:18:42.5334685Z + [[ 3 == 1 ]] 2025-12-04T09:18:42.5334872Z + [[ 3 == 2 ]] 2025-12-04T09:18:42.5335051Z + [[ 3 -gt 2 ]] 2025-12-04T09:18:42.5335252Z + install_torchvision 2025-12-04T09:18:42.5335468Z + local orig_preload 2025-12-04T09:18:42.5335677Z + local commit 2025-12-04T09:18:42.5335888Z ++ get_pinned_commit vision 2025-12-04T09:18:42.5336129Z ++ cat .github/ci_commit_pins/vision.txt 2025-12-04T09:18:42.5341990Z + commit=617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:42.5342217Z + orig_preload= 2025-12-04T09:18:42.5342363Z + '[' -n '' ']' 2025-12-04T09:18:42.5342524Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T09:18:42.5342909Z + pip_build_and_install git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e dist/vision 2025-12-04T09:18:42.5343437Z + local build_target=git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:42.5343755Z + local wheel_dir=dist/vision 2025-12-04T09:18:42.5343927Z + local found_whl=0 2025-12-04T09:18:42.5344089Z + for file in "${wheel_dir}"/*.whl 
2025-12-04T09:18:42.5344276Z + [[ -f dist/vision/*.whl ]] 2025-12-04T09:18:42.5344445Z + '[' 0 == 0 ']' 2025-12-04T09:18:42.5344834Z + python3 -m pip wheel --no-build-isolation --no-deps -w dist/vision git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:42.6837539Z Collecting git+https://github.com/pytorch/vision.git@617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:42.6840253Z Cloning https://github.com/pytorch/vision.git (to revision 617079d944b0e72632311c30ae2bbdf1168b901e) to /tmp/pip-req-build-r4cfrrtj 2025-12-04T09:18:42.6857775Z Running command git clone --filter=blob:none --quiet https://github.com/pytorch/vision.git /tmp/pip-req-build-r4cfrrtj 2025-12-04T09:18:45.9987071Z Running command git rev-parse -q --verify 'sha^617079d944b0e72632311c30ae2bbdf1168b901e' 2025-12-04T09:18:46.0004385Z Running command git fetch -q https://github.com/pytorch/vision.git 617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:46.6529534Z Resolved https://github.com/pytorch/vision.git to commit 617079d944b0e72632311c30ae2bbdf1168b901e 2025-12-04T09:18:48.2319431Z Preparing metadata (pyproject.toml) ... done 2025-12-04T09:18:48.2345307Z Building wheels for collected packages: torchvision 2025-12-04T09:19:26.5066811Z Building wheel for torchvision (pyproject.toml) ... 
done 2025-12-04T09:19:26.5086644Z Created wheel for torchvision: filename=torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl size=1809011 sha256=a479b31da22f7ef687a96f9f92e9fc404ee91f14afbbcafd64eac877edd28a63 2025-12-04T09:19:26.5089758Z Stored in directory: /var/lib/jenkins/.cache/pip/wheels/12/b2/29/1f82685c5b5173629e1f36a9b93989ce92ce563e5fb91d27ac 2025-12-04T09:19:26.5115129Z Successfully built torchvision 2025-12-04T09:19:26.5690985Z + for file in "${wheel_dir}"/*.whl 2025-12-04T09:19:26.5691660Z + pip_install_whl dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:19:26.5692527Z + args=('dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl') 2025-12-04T09:19:26.5693000Z + local args 2025-12-04T09:19:26.5693487Z + [[ dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl == *\ * ]] 2025-12-04T09:19:26.5693894Z + for path in "${args[@]}" 2025-12-04T09:19:26.5694273Z + echo 'Installing dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl' 2025-12-04T09:19:26.5694798Z Installing dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:19:26.5695403Z + python3 -mpip install --no-index --no-deps dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:19:26.7209715Z Processing ./dist/vision/torchvision-0.25.0a0+617079d-cp310-cp310-linux_x86_64.whl 2025-12-04T09:19:26.7255386Z Installing collected packages: torchvision 2025-12-04T09:19:26.9274529Z Successfully installed torchvision-0.25.0a0+617079d 2025-12-04T09:19:26.9464856Z + '[' -n '' ']' 2025-12-04T09:19:26.9465064Z + test_python_shard 3 2025-12-04T09:19:26.9465248Z + [[ -z 6 ]] 2025-12-04T09:19:26.9467811Z + python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --exclude-quantization-tests --shard 3 6 --verbose --upload-artifacts-while-running 2025-12-04T09:19:28.5668402Z Excluding 
inductor/test_max_autotune on ROCm 2025-12-04T09:19:28.5668756Z Excluding test_cuda_nvml_based_avail on ROCm 2025-12-04T09:19:29.5473399Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-12-04T09:19:29.8922877Z Ignoring disabled issues: [''] 2025-12-04T09:19:29.8972559Z Found test times from artifacts 2025-12-04T09:19:29.9137939Z Found test times from artifacts 2025-12-04T09:19:29.9142999Z Running all tests 2025-12-04T09:19:29.9428416Z Running parallel tests on 1 processes 2025-12-04T09:19:29.9431348Z Name: tests to run (est. time: 181.41min) 2025-12-04T09:19:29.9431801Z Serial tests (91): 2025-12-04T09:19:29.9432401Z inductor/test_aot_inductor 2/3 2025-12-04T09:19:29.9432823Z inductor/test_torchinductor_codegen_dynamic_shapes 1/4 2025-12-04T09:19:29.9433192Z inductor/test_torchinductor_opinfo 2/12 2025-12-04T09:19:29.9433430Z inductor/test_torchinductor_opinfo 8/12 2025-12-04T09:19:29.9433649Z inductor/test_group_batch_fusion 1/1 2025-12-04T09:19:29.9433867Z dynamo/test_dynamic_shapes 2/2 2025-12-04T09:19:29.9434076Z inductor/test_cuda_repro 1/1 2025-12-04T09:19:29.9434272Z dynamo/test_after_aot 1/1 2025-12-04T09:19:29.9434473Z inductor/test_snode_runtime 1/1 2025-12-04T09:19:29.9434679Z inductor/test_minifier 1/1 2025-12-04T09:19:29.9434871Z inductor/test_perf 1/1 2025-12-04T09:19:29.9435069Z inductor/test_fused_attention 1/1 2025-12-04T09:19:29.9435291Z inductor/test_mkldnn_pattern_matcher 1/2 2025-12-04T09:19:29.9435524Z inductor/test_cpu_select_algorithm 1/1 2025-12-04T09:19:29.9435741Z inductor/test_cuda_select_algorithm 1/1 2025-12-04T09:19:29.9435960Z inductor/test_aot_inductor_arrayref 1/1 2025-12-04T09:19:29.9436176Z inductor/test_deterministic 1/4 2025-12-04T09:19:29.9436865Z inductor/test_inductor_utils 1/1 2025-12-04T09:19:29.9437094Z inductor/test_template_heuristics_registry 1/1 2025-12-04T09:19:29.9437327Z inductor/test_async_compile 1/1 
2025-12-04T09:19:29.9437535Z inductor/test_gpu_cpp_wrapper 1/1 2025-12-04T09:19:29.9437740Z dynamo/test_utils 1/1 2025-12-04T09:19:29.9437933Z inductor/test_provenance_tracing 1/1 2025-12-04T09:19:29.9438146Z dynamo/test_interop 1/1 2025-12-04T09:19:29.9438337Z functorch/test_eager_transforms 1/1 2025-12-04T09:19:29.9438546Z inductor/test_benchmarking 1/1 2025-12-04T09:19:29.9438750Z inductor/test_helion_kernels 1/1 2025-12-04T09:19:29.9438949Z inductor/test_quantization 1/1 2025-12-04T09:19:29.9439143Z inductor/test_best_config 1/1 2025-12-04T09:19:29.9439337Z export/test_tools 1/1 2025-12-04T09:19:29.9439531Z inductor/test_compiled_optimizers 1/2 2025-12-04T09:19:29.9439740Z inductor/test_aot_inductor_utils 1/1 2025-12-04T09:19:29.9439951Z dynamo/test_graph_region_tracker 1/1 2025-12-04T09:19:29.9440155Z inductor/test_compile 1/1 2025-12-04T09:19:29.9440355Z inductor/test_scatter_optimization 1/1 2025-12-04T09:19:29.9440561Z dynamo/test_functions 1/1 2025-12-04T09:19:29.9440757Z inductor/test_ordered_set 1/1 2025-12-04T09:19:29.9440957Z dynamo/test_install_free_tensors 1/1 2025-12-04T09:19:29.9441209Z inductor/test_torchinductor_codegen_config_overrides 1/1 2025-12-04T09:19:29.9441461Z export/test_passes 1/1 2025-12-04T09:19:29.9441646Z dynamo/test_autograd_function 1/1 2025-12-04T09:19:29.9442045Z inductor/test_codecache 1/1 2025-12-04T09:19:29.9442250Z inductor/test_distributed_patterns 1/1 2025-12-04T09:19:29.9442463Z dynamo/test_fake_distributed 1/1 2025-12-04T09:19:29.9442666Z export/test_nativert 1/1 2025-12-04T09:19:29.9442862Z inductor/test_custom_op_autotune 1/1 2025-12-04T09:19:29.9443073Z export/test_converter 1/1 2025-12-04T09:19:29.9443246Z dynamo/test_reorder_logs 1/1 2025-12-04T09:19:29.9443399Z dynamo/test_subclasses 1/1 2025-12-04T09:19:29.9443544Z export/test_verifier 1/1 2025-12-04T09:19:29.9443690Z export/test_sparse 1/1 2025-12-04T09:19:29.9443829Z test_weak 1/1 2025-12-04T09:19:29.9443947Z test_decomp 2/12 2025-12-04T09:19:29.9444083Z 
test_decomp 8/12 2025-12-04T09:19:29.9444216Z lazy/test_functionalization 1/1 2025-12-04T09:19:29.9444377Z torch_np/test_random 1/1 2025-12-04T09:19:29.9444528Z nn/test_multihead_attention 1/1 2025-12-04T09:19:29.9444684Z lazy/test_bindings 1/1 2025-12-04T09:19:29.9444818Z xpu/test_conv 1/1 2025-12-04T09:19:29.9444957Z test_utils 1/1 2025-12-04T09:19:29.9445073Z test_pytree 1/1 2025-12-04T09:19:29.9445210Z test_namedtuple_return_api 1/1 2025-12-04T09:19:29.9445371Z profiler/test_record_function 1/1 2025-12-04T09:19:29.9445531Z test_compile_benchmark_util 1/1 2025-12-04T09:19:29.9445696Z test_set_default_mobile_cpu_allocator 1/1 2025-12-04T09:19:29.9445866Z test_fake_tensor 1/1 2025-12-04T09:19:29.9446002Z test_stateless 1/1 2025-12-04T09:19:29.9446141Z test_binary_ufuncs 1/1 2025-12-04T09:19:29.9446279Z test_ops_jit 1/1 2025-12-04T09:19:29.9446409Z test_nestedtensor 2/2 2025-12-04T09:19:29.9446548Z test_modules 1/2 2025-12-04T09:19:29.9446684Z test_cpp_extensions_mtia_backend 1/1 2025-12-04T09:19:29.9446847Z lazy/test_ts_opinfo 1/1 2025-12-04T09:19:29.9446984Z test_dynamic_shapes 1/1 2025-12-04T09:19:29.9447125Z test_schema_check 1/1 2025-12-04T09:19:29.9447261Z test_ops 3/5 2025-12-04T09:19:29.9447381Z test_jit_llga_fuser 1/1 2025-12-04T09:19:29.9447526Z test_sparse_csr 2/2 2025-12-04T09:19:29.9447666Z optim/test_optim 1/1 2025-12-04T09:19:29.9447819Z torch_np/numpy_tests/core/test_getlimits 1/1 2025-12-04T09:19:29.9447992Z torch_np/test_ndarray_methods 1/1 2025-12-04T09:19:29.9448145Z test_view_ops 1/1 2025-12-04T09:19:29.9448277Z test_type_info 1/1 2025-12-04T09:19:29.9448410Z functorch/test_aotdispatch 1/1 2025-12-04T09:19:29.9448605Z test_nn 1/2 2025-12-04T09:19:29.9448743Z torch_np/test_reductions 1/1 2025-12-04T09:19:29.9448909Z torch_np/numpy_tests/core/test_scalar_ctors 1/1 2025-12-04T09:19:29.9449098Z torch_np/numpy_tests/lib/test_arraypad 1/1 2025-12-04T09:19:29.9449257Z test_prims 1/1 2025-12-04T09:19:29.9449379Z test_spectral_ops 1/1 
2025-12-04T09:19:29.9449514Z doctests 1/1 2025-12-04T09:19:29.9449634Z Parallel tests (0): 2025-12-04T09:19:29.9449777Z Name: excluded (est. time: 0.0min) 2025-12-04T09:19:29.9449926Z Serial tests (0): 2025-12-04T09:19:29.9450049Z Parallel tests (0): 2025-12-04T09:19:29.9450282Z Running inductor/test_aot_inductor 2/3 ... [2025-12-04 09:19:29.943291][3566478.468102771] 2025-12-04T09:19:29.9450526Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:19:29.9451055Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor.py', '--shard-id=2', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:19:29.943513] 2025-12-04T09:28:11.1094578Z 2025-12-04T09:28:11.1095608Z inductor/test_aot_inductor 2/3 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_2.3_e93a53a2ed56016f_.log 2025-12-04T09:28:11.1158754Z Running 320 items in this shard: test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse, test/inductor/test_aot_inductor.py::AOTInductorLoggingTest::test_shape_env_reuse_zero_consts_use_consts_asm_false, test/inductor/test_aot_inductor.py::TestAOTInductorConfig::test_no_compile_standalone, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aliased_buffer_reuse_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_fp8_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_sym_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_runtime_asserts_backed_symint_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_user_defined_triton_kernel_profiling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_assert_async_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_autotune_int64_user_defined_triton_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_buffer_mutation_3_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_clamp_decomposition_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_cpu_predicate_cuda_operands_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_cpu_predicate_cuda_operands_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_cond_with_multiple_outputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_consecutive_compiles_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_original_fqn_and_dtype_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_constant_type_propagation_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_conv3d_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_custom_op_in_subgraph_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_d2h_copy_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dup_unbacked_sym_decl_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_cat_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_scalar_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_dynamic_smem_above_default_limit_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_empty_constant_folding_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fill__fallback_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_fqn_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_inf_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_large_dynamic_dim_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_libtorch_free_so_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_linear_freezing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misaligned_input_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_misc_1_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_cubin_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_missing_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_model_modified_weights_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_nan_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_no_args_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_none_args_aot_codegen_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_output_path_1_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_profile_benchmark_harness_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_hann_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_proxy_executor_permute_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeat_interleave_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_repeated_calling_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_return_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_reuse_kernel_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_complex_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_runtime_checks_fp8_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_same_backing_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_scaled_dot_product_efficient_attention_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sdpa_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_False_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_simple_embed_kernel_binary_True_max_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_from_multi_output_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_size_with_unbacked_add_expr_transitive_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_small_constant_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sym_i64_input_codegen_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symfloat_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_symint_item_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_sympy_cpp_printer_min_max_minmax0_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_autotuning_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_dynamic_launcher_grid_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_expr_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_sympy_fn_like_arg_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_triton_next_power_of_2_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_expr_replacements_shift_k_1_use_static_size_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_unbacked_expr_replacements_shift_k_2_use_static_size_True_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_constant_buffer_simple_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_update_user_managed_buffer_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_upper_bound_i64_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_nested_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_outer_buffers_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_while_loop_with_pytree_inputs_cpu, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_zero_grid_with_unbacked_symbols_cpu, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aot_inductor_consts_cpp_build_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_fp8_dtype_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_debug_printer_user_defined_triton_kernel_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_backed_symint_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_aoti_runtime_asserts_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bmm_multiple_dynamic_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_bool_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_boolean_indexing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_buffer_mutation_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_codegen_int_array_var_fix_memory_leak_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_cpu_predicate_cuda_operands_max_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_mismatched_branch_output_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_non_tensor_predicates_dynamic_True_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_symint_input_disable_one_pass_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_unbacked_symint_closure_dynamic_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_multiple_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_reinterpret_view_inputs_outputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_cond_with_replace_view_ops_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_consecutive_compiles_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_folding_with_update_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_constant_type_propagation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_convolution_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_custom_op_in_subgraph_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dup_unbacked_sym_decl_with_refinement_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_duplicated_params_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_dynamic_scalar_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_embedding_bag_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_constant_folding_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_empty_graph_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_kernel_with_symexpr_output_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fallback_mem_leak_fix_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fft_c2c_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fp8_view_of_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_fx_gm_return_tuple_validation_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_index_put_with_none_index_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_int_list_input_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_issue_140766_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_dynamic_dim_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_large_mmaped_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_linear_freezing_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_misc_1_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_cubin_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_missing_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_mixed_device_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_model_modified_weights_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_narrow_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_non_contiguous_output_alias_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_misaligned_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_output_path_1_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_poi_multiple_dynamic_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_profile_benchmark_harness_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_proxy_executor_abs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_quanatized_int8_linear_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_interleave_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_repeat_output_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_replicate_on_devices_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_dtype_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_fp8_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_large_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_runtime_checks_shape_failed_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scaled_dot_product_efficient_attention_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_scatter_reduce_fallback_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sdpa_2_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_shifted_constraint_ranges_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_simple_embed_kernel_binary_True_max_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_size_with_unbacked_add_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_so_without_weight_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_stride_with_unbacked_expr_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_subclasses_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_sym_i64_input_codegen_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_symfloat_item_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_dynamic_launcher_grid_infer_from_tensor_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_bool_param_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_dynamic_grid_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_arg_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_kernel_weird_param_order_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_triton_mutated_autotuning_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_expr_replacements_shift_k_0_use_static_size_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_expr_replacements_shift_k_3_use_static_size_False_cuda, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbacked_expr_replacements_shift_k_3_use_static_size_True_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_unbounded_expr_substitutions_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_constant_buffer_simple_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_update_user_managed_buffer_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_using_model_name_for_files_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_nested_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_conv_dynamic_False_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_outer_code_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_while_loop_with_pytree_inputs_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_no_triton_profiler_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_with_offset_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleGpu::test_zero_grid_with_unbacked_symbols_cuda, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_addmm_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_constant_tensor_name_collision_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_cpp_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_sym_inputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_debug_printer_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_aoti_profiler_enable_kernel_profile_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_autotune_int64_user_defined_triton_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_bool_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_boolean_indexing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_mutation_and_force_mmap_weights_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_buffer_reuse_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_cpu_predicate_cuda_operands_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_mismatched_branch_output_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_non_tensor_predicates_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_symint_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_cond_with_outer_code_before_after_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_consecutive_compiles_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_constant_folding_with_update_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv3d_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_conv_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_custom_op_in_subgraph_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_dup_unbacked_sym_decl_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicate_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_duplicated_params_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_cat_dtype_promotion_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_empty_constant_folding_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_extract_constants_map_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_freezing_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_int_list_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_linear_dynamic_maxautotune_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_load_package_multiple_gpus_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misaligned_input_2_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_misc_1_max_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_missing_cubin_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_model_modified_weights_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_multiple_output_alias_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_default_gpu_device_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_non_tensor_input_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_on_gpu_device1_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_pad_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_profile_benchmark_harness_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_proxy_executor_hann_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_quantized_linear_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_replicate_on_devices_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_return_constant_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_reuse_kernel_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_complex_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_runtime_checks_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_dot_product_efficient_attention_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scaled_grouped_mm_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_scatter_fallback_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_sdpa_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_embed_kernel_binary_True_max_autotune_True_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_multi_arch_embed_kernel_binary_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_simple_split_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_size_with_unbacked_add_expr_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_so_without_weight_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_infer_from_tensor_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_dynamic_launcher_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_dynamic_grid_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_equal_to_1_float_arg_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_extern_kernel_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_reinterpret_view_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_sympy_fn_like_arg_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_mps, 
test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_triton_mutated_autotuning_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_expr_replacements_shift_k_2_use_static_size_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_unbacked_expr_replacements_shift_k_3_use_static_size_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_update_constant_buffer_simple_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_upper_bound_i64_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_view_outputs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_weight_on_disk_legacy_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_conv_dynamic_True_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_outer_code_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_while_loop_with_sym_expr_cond_dynamic_False_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_cudagraphs_mps, test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleMps::test_with_offset_mps 2025-12-04T09:28:11.1206874Z 2025-12-04T09:28:11.1207048Z Finished inductor/test_aot_inductor 2/3 ... 
[2025-12-04 09:28:11.109543][3566999.634353218], took 8.69min 2025-12-04T09:28:11.1207457Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:28:13.2747417Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:28:13.2748130Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T09:28:13.2748626Z Uploading artifacts took 0.00 seconds 2025-12-04T09:28:13.2749456Z Running inductor/test_torchinductor_codegen_dynamic_shapes 1/4 ... [2025-12-04 09:28:13.274745][3567001.799553139] 2025-12-04T09:28:13.2750074Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:28:13.2751746Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_dynamic_shapes.py', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 09:28:13.275014] 2025-12-04T09:33:32.8445111Z 2025-12-04T09:33:32.8446373Z inductor/test_torchinductor_codegen_dynamic_shapes 1/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_dynamic_shapes_1.4_04bc65872ffb23f1_.log 2025-12-04T09:33:32.8537529Z Running 440 items in this shard: test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test__dyn_quant_matmul_4bit_bf16_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool1d_argmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_adaptive_avg_pool_with_output_size_0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_complex7_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_add_const_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_support_out_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_aoti_eager_with_scalar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_arange6_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_on_views_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_as_strided_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_fail_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_alignment_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool2d_backward_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_avg_pool3d_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_batch_norm_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int16_uint8_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int64_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_int8_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_bucketize_int_uint8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_builtins_round_float_ndigits_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_inplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_negative_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_of_loops_and_extern_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_unbacked_empty_1d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cat_upcasting_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_compar_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_config_option_dont_assume_alignment_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_const_int32_to_float_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_constant_pad_2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv3d_channels_last_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_conv_functional_bn_fuse_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_scalar_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_tensor_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cpu_tensor_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_cudnn_rnn_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_custom_op_fixed_layout_sequential_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_data_type_propogation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_device_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div6_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div9_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_div_by_zero_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_bfloat16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float16_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_int32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float32_uint8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_float16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_float64_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_float64_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int16_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_float32_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int32_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int64_int64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_float64_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_int8_int8_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_dtypeview_uint8_bfloat16_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_byte_unpack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_embedding_bag_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_exp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_expand_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fallback_mutable_op_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fft_real_input_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_flexible_layout_immutable_free_symbols_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_fmin_fmax_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_full_like_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_functionalize_rng_wrappers_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gather2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_generated_code_has_alignment_assert_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gpu_scalar_with_cpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_gpu_scalar_with_gpu_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_propagation_abs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_index_put_failed_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inner_fn_str_and_stride_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_inplace_flip_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_int8_weight_only_quant_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_invalid_operand_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_isinf2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_lgamma_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_like_rands_sliced_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_linspace4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_logcumsumexp_zero_dim_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_long_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_low_memory_max_pool_dilation_1_dim_3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_masked_fill_promotion_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_max_pool2d_with_indices_backward4_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mean_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_misaligned_address_issue1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_mix_device_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_move_arange_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_gpu_recompile_on_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_multi_threading_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_assert_inside_triton_kernel_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_sort_stable_False_descending_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_nan_to_num_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_narrow_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_op_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_no_specization_over_symbolic_value_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_permute1_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_philox_rand_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pixel_shuffle_channels_last_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_airy_ai_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_bessel_y1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_entr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_gammaln_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_log_ndtr_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_i1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_modified_bessel_k1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_ndtri_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_scaled_modified_bessel_k0_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_t_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pointwise_shifted_chebyshev_polynomial_w_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_polar_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_pow2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_prod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_randint_int64_mod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reflection_pad2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_reinterpret_dtypeview_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_remove_noop_view_dtype_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_repeat_interleave_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_replication_pad_errors_with_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_require_stride_expanded_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_roll_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_round_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scalar_cpu_tensor_arg_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_scatter_add2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_True_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sdpa_unaligned_mask_freezing_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_select_scatter_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_shape_padding_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_signbit_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_mutation3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter5_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_slice_scatter_reinplace_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_bool_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sort_transpose_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumprod_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_cumsum_low_prec_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_failed_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_reduction_with_int64_size_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_list_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_sizes_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_split_with_unbacked_symints_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stack_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_std_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_strided_inputs_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum2_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_sum5_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_tensor_index_slice_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_topk_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_triton_argmin_argmax_transpose_logical_index_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_triton_kernel_bool_param_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_uint4x2_mixed_mm_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unfold_zero_dimension_tensor_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unroll_small_reduction_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_unsigned_constant_tensors_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_upsample_nearest2d_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vdd_clamp_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_vectorized_ops_masked_dynamic_shapes_cpu, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_view_on_aliased_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views1_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views3_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_views4_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenCpuTests::test_zero_element_mutation_dynamic_shapes_cpu, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test__dyn_quant_pack_4bit_weight_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_avg_pool2d_low_prec_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_adaptive_max_pool2d2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_add_const_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_addmm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aliased_buffer_reuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_allow_reuse_active_if_under_peak_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_override_registration_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_persistent_cache_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_aoti_eager_with_scalar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_arange5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_argmax_argmin1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_as_strided_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_alignment_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_fail_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_assert_size_stride_op_name_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool2d_backward3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_avg_pool3d_backward4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_batch_norm_2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bernoulli2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_both_scalars_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_computed_offsets_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_int16_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_bucketize_int_uint8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_buffer_copied_in_graph_with_different_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_empty_index_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cat_upcasting_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_compar_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_consecutive_split_cumprod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_const_int32_to_float_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_2d_strides_nonpositive_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_constant_pad_fill_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv1d_depthwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_conv_functional_bn_fuse_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_convolution3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_copy_with_scalar_src_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cpu_scalar_with_gpu_tensor_dynamic_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cudnn_rnn_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_inf_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_no_mask_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_cumsum_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_default_layout_constraint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_op_unbacked_symints_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_custom_scan_op_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_data_type_propogation_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_device_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div9_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_precision_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_div_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dont_constant_fold_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout3_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dropout_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_bfloat16_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float16_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float32_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_bfloat16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_float64_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int16_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_int64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int32_uint8_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int64_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_float32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_int8_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_float64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_dtypeview_uint8_uint8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_embedding_bag_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exact_stride_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_exp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_expand_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fallback_mutable_op_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fill2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_flip_cat_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float16_to_int16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float32_to_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_float_repr_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fmod_zero_dim_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_fractional_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_full_like_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_gather_scatter_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_generated_code_has_size_stride_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_glu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_arange2_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_constant_tensor2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_mutation_real_name_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_graph_partition_pad_dynamic_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_grid_sampler_expand_preserves_view_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_float_zero_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_propagation_abs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_put4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_index_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_indirect_load_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inductor_assert_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_flip_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_resize_as_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_inplace_where_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_input_mutation4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_insignificant_strides_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_int8_weight_only_quant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_kwargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_large_offset_pointwise_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_leaky_relu_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_channels_last_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_like_rands_sliced_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linalg_eig_stride_consistency_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linear_dynamic_maxautotune_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_linspace4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_lite_mode_fallback_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_lite_mode_not_decompose_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_fp64_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_log_softmax_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_logsumexp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_masked_fill_promotion_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d7_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d8_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_max_pool2d_with_indices_backward5_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_min_max_reduction_nan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_misaligned_address_issue1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_mixed_mm_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_move_arange_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_multilayer_prime_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_False_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nan_sort_stable_True_descending_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_narrow_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_neg_index_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_new_ones_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_no_op_reduction_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_nonzero_unbacked_refinement_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pad_view_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pattern_matcher_unbacked_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_philox_rand_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_bessel_y1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_chebyshev_polynomial_v_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_digamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_entr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_hermite_polynomial_he_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_i0e_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_legendre_polynomial_p_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_modified_bessel_i0_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtr_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_ndtri_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_pointwise_scaled_modified_bessel_k1_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_profiler_mark_wrapper_call_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_distribution_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_int64_mod_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_randint_kernel_count_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_reduction5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remainder_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_clone_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_remove_noop_slice_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_repeat_interleave_Tensor_decomp_int64_nd_1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_rsqrt_dynamic_shapes_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scaled_dot_product_attention_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter1_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter4_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_bf16_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_scatter_reduce3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sdpa_prefer_nd_tiling_False_use_block_ptr_False_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_searchsorted_broadcast_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sigmoid_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_signbit_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_simplify_loops_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter5_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_scatter_dtype_consistency_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_slice_view_with_graph_break_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_softmax_one_kernel_loop_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_special_polygamma_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_reduction_with_int64_size_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_split_with_list_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_squeeze_varargs_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stack_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_std_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_stride_preservation_with_stride_modifying_fx_pass_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum2_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_dtype_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_sum_int_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tan_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_tensor3_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_to_device_constant_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_transposed_propagates_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unbacked_floordiv_simplify_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unsigned_constant_tensors_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_unspec_inputs_int32_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_bilinear2d_b_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest1d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest2d_backward_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_upsample_nearest3d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_div_by_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_var_mean_tile_reduction_True_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_vdd_clamp_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views1_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_views6_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_weight_norm_conv2d_dynamic_shapes_cuda, test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_where_broadcast_dynamic_shapes_cuda, 
test/inductor/test_torchinductor_codegen_dynamic_shapes.py::DynamicShapesCodegenGPUTests::test_zero_element_mutation_dynamic_shapes_cuda
2025-12-04T09:33:32.8617473Z
2025-12-04T09:33:32.8617741Z Finished inductor/test_torchinductor_codegen_dynamic_shapes 1/4 ... [2025-12-04 09:33:32.844556][3567321.369366873], took 5.33min
2025-12-04T09:33:32.8618192Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T09:33:32.8618553Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T09:33:32.8618799Z Running inductor/test_torchinductor_opinfo 2/12 ... [2025-12-04 09:33:32.850553][3567321.375367152]
2025-12-04T09:33:32.8619007Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T09:33:32.8619416Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '--shard-id=2', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:33:32.850751]
2025-12-04T09:41:24.0205143Z
2025-12-04T09:41:24.0205926Z inductor/test_torchinductor_opinfo 2/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_2.12_10d2fbe4b26b6d84_.log
2025-12-04T09:41:24.0256719Z Running 326 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___radd___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__native_batch_norm_legit_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__segment_reduce_lengths_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_acosh_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcmul_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmm_decomposed_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_all_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argwhere_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_asin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atanh_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_or_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_right_shift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_xor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cholesky_solve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_chunk_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_combinations_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_contiguous_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_copysign_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_corrcoef_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cosh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_count_nonzero_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cummin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumprod_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diff_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_floor_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_div_trunc_rounding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dstack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_einsum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_strided_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_equal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fft_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfft2_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_rfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_flip_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_float_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_uint32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_full_like_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gcd_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_heaviside_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hsplit_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hstack_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amin_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_inner_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isin_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isposinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_item_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_unary_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_kron_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ldexp_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lgamma_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cholesky_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_householder_product_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lu_solve_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_multi_dot_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_solve_triangular_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log10_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logaddexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_xor_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mT_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_amax_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumsum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_fill_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_prod_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_var_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_binary_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_with_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_minimum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nan_to_num_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nansum_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_batch_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ne_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_ones_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_zeros_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_celu_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_conv2d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_dropout_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_gelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardswish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hinge_embedding_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_bicubic_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_linear_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_l1_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_multi_head_attention_forward_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_circular_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_constant_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_rrelu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softmin_with_dtype_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_unfold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_normal_number_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ones_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ormqr_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pca_lowrank_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_permute_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pow_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rad2deg_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randint_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_randn_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reshape_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize__cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resize_as__cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rot90_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_sum_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_blackman_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_exponential_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_hann_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_nuttall_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_scatter_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sort_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sparse_sampled_addmm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_j0_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_entr_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_erfcx_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_hermite_polynomial_he_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i0e_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sqrt_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_square_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_std_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_t_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_take_along_dim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tensor_split_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_topk_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trace_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapezoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_trapz_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_triu_indices_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_true_divide_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unique_consecutive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsafe_chunk_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_as_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zeros_like_cuda_float32
2025-12-04T09:41:24.0306720Z
2025-12-04T09:41:24.0306863Z Finished inductor/test_torchinductor_opinfo 2/12 ... [2025-12-04 09:41:24.020354][3567792.545164862], took 7.85min
2025-12-04T09:41:24.0307343Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T09:41:24.0307698Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T09:41:24.0307939Z Running inductor/test_torchinductor_opinfo 8/12 ... [2025-12-04 09:41:24.026775][3567792.551588749]
2025-12-04T09:41:24.0308148Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T09:41:24.0308553Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_opinfo.py', '--shard-id=8', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:41:24.026962]
2025-12-04T09:46:47.1801782Z
2025-12-04T09:46:47.1802725Z inductor/test_torchinductor_opinfo 8/12 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_opinfo_8.12_ca3e9f6e92e4ea3c_.log
2025-12-04T09:46:47.1847844Z Running 270 items in this shard: test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_H_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_T_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rdiv___cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rmul___cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rpow___cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive___rsub___cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__softmax_backward_data_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive__unsafe_masked_index_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_abs_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_add_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addcdiv_cuda_float32,
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addmv_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_addr_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_alias_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_allclose_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amax_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_amin_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_aminmax_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_any_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_argsort_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_as_strided_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan2_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atan_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_atleast_3d_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bfloat16_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bitwise_not_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bool_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_shapes_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_broadcast_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_bucketize_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_byte_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cartesian_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cdouble_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ceil_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_clamp_min_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_column_stack_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_complex_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_conj_physical_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cos_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cov_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cross_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumsum_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_cumulative_trapezoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diag_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagflat_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_copy_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_diagonal_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_digamma_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dot_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_double_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_dsplit_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_empty_permuted_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eq_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_erfc_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_as_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_copy_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expand_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_expm1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_eye_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_fftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_hfftn_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ifftshift_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_ihfft2_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfft_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fft_irfftn_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_floor_divide_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_fmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_geometric_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_gt_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_half_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_hash_tensor_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_reduce_amax_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_index_select_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_int_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isinf_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isnan_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isneginf_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_float64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_isreal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_le_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_cross_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_diagonal_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_ldl_factor_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_lstsq_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_tensorsolve_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linalg_vector_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_linspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logical_and_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logit_cuda_int32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logspace_tensor_overload_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_logsumexp_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_long_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_lu_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mH_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_cumprod_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_log_softmax_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_logaddexp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_mean_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_norm_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_masked_scatter_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_matrix_exp_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_no_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_max_reduction_with_dim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_maximum_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mean_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_median_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_binary_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_min_reduction_no_dim_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_movedim_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_msort_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mv_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_narrow_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_native_layer_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_empty_strided_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_new_full_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_adaptive_max_pool1d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_bilinear_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_hardtanh_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_huber_loss_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_instance_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_local_response_norm_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_logsigmoid_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float16, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_mish_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_scaled_dot_product_attention_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softplus_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_softsign_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_nonzero_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_norm_fro_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_outer_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_pinverse_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_2_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_positive_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_put_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_qr_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_ravel_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_reciprocal_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_remainder_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_repeat_interleave_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_resolve_conj_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_round_decimals_3_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_rsqrt_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scalar_tensor_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_scatter_reduce_prod_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_select_scatter_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_short_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sigmoid_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sign_cuda_bool, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signal_windows_general_cosine_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_signbit_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sinc_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_slice_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_airy_ai_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_bessel_y1_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_i1e_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_laguerre_polynomial_l_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float32, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_list_args_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_copy_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_split_with_sizes_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_stack_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sub_cuda_int64, 
test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_sum_to_size_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tanh_cuda_bool, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_cuda_uint8, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_to_sparse_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_transpose_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_tril_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unflatten_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unfold_copy_cuda_int64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_uniform_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_unsqueeze_cuda_float32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_var_unbiased_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_view_copy_cuda_float64, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_vsplit_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_where_cuda_int32, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_xlogy_cuda_float16, test/inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_zero__cuda_int32 2025-12-04T09:46:47.1889110Z 
2025-12-04T09:46:47.1889249Z Finished inductor/test_torchinductor_opinfo 8/12 ... [2025-12-04 09:46:47.180213][3568115.705023343], took 5.39min
2025-12-04T09:46:47.1889657Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T09:46:47.1890024Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T09:46:47.1890261Z Running inductor/test_group_batch_fusion 1/1 ... [2025-12-04 09:46:47.186412][3568115.711225916]
2025-12-04T09:46:47.1890456Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T09:46:47.1890855Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_group_batch_fusion.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:46:47.186601]
2025-12-04T09:47:35.0052252Z
2025-12-04T09:47:35.0053280Z inductor/test_group_batch_fusion 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_group_batch_fusion_1.1_813dadab6cf57408_.log
2025-12-04T09:47:35.0057238Z Running 13 items in this shard: test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_dropout_pre_grad_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_layer_norm_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_linear_lhs_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_batch_linear_pre_grad_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_gate_fusion_post_grad, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_group_linear_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_group_linear_fusion_different_shapes, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_math_op_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_pointwise_op_fusion, test/inductor/test_group_batch_fusion.py::TestGroupBatchFusion::test_pointwise_op_fusion_post_grad, test/inductor/test_group_batch_fusion.py::TestPostGradBatchLinearFusion::test_batch_linear_post_grad_fusion, test/inductor/test_group_batch_fusion.py::TestFindIndependentSubsetGreedy::test_find_independent_subset_greedy, test/inductor/test_group_batch_fusion.py::TestFindIndependentSubsetGreedy::test_find_independent_subset_greedy_fuse
2025-12-04T09:47:35.0060763Z
2025-12-04T09:47:35.0061001Z Finished inductor/test_group_batch_fusion 1/1 ... [2025-12-04 09:47:35.004854][3568163.52966426], took 0.80min
2025-12-04T09:47:35.0061751Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T09:47:35.0112217Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T09:47:35.0112535Z Running dynamo/test_dynamic_shapes 2/2 ... [2025-12-04 09:47:35.011044][3568163.535858333]
2025-12-04T09:47:35.0112785Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T09:47:35.0115107Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_dynamic_shapes.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:47:35.011231]
2025-12-04T09:54:29.2563047Z
2025-12-04T09:54:29.2563707Z dynamo/test_dynamic_shapes 2/2 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_dynamic_shapes_2.2_bfdecf96bb9cbf6b_.log
2025-12-04T09:54:29.2713220Z Running 974 items in this shard: test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_311_resume_block_keyerror_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_float64_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_graph_break_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_autocast_sdpa_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_context_wrapping_grad_mode_decorator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_context_wrapping_set_grad_enabled_nested_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_method_create_stream_outside_of_compile_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_event_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_across_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_compared_with_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_cuda_stream_context_manager2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_disable_saved_tensors_hooks_prev_disabled_nested_dynamic_shapes,
test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_context_manager_with_graph_break_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_generic_ctx_manager_with_graph_break_CustomizedCtxManagerWithGraphBreak_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_grad_mode_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_local_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_local_nullctx2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_inactive_context_graph_break_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_is_autocast_cpu_enabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_CustomizedCtxManager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_nested_generic_context_manager_with_graph_break_customized_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_return_context_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_sdpa_kernel_ctx_manager_set_priority_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_store_attr_graph_break_key_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_torch_profiler_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesCtxManagerTests::test_torch_profiler_use_after_with_block_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_add__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_addcdiv__dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_addcdiv_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_build_list_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_call_dict5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_callable_torch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_chunks1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_compare_constant_and_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_const_tuple_add1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_const_tuple_add2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_constr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_default_dict_set_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_defaultdict_setdefault1_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_deque_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_device_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_copy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_id_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_items_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_sorted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_tuple_lazy_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dict_update_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_elipsis_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_enumerate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_enumerate_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_filter_infinite_iterator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_finfo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_flat_param_same_storage_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fn_with_self_set_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_fstrings6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_funcdef_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_functools_partial_binding_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_functools_partial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_generic_namedtuple_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_calculate_correct_fan_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_get_privateuse1_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_getattr_metaclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_globalmodule_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_globalvar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_in_not_in_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indirect1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_indirect2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_jit_annotations_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_lru_cache_fn_with_default_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inline_with_default_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_inner_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_contiguous_frame_counts_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_in_onnx_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_inference_mode_global_recompilation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_integer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_not_null_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_quantized_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_is_sparse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itemgetter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_combinations_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_compress_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_pairwise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_permutations_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_product_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_product_various_iterators_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_itertools_reconstruct_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_jit_annotate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_constant_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_constant_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_len_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_add_then_mutate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_compare_polyfill_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_compare_polyfill_non_lists_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_index_with_constant_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_setitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_slice_assignment_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_sorted1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_sorted2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_list_truth_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_listarg5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_manual_seed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_deque_extendleft_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_enumerate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_infinite_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_list_extend_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_max_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_partial_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_sum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_unpack_vars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_map_zip_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_match_mapping_and_match_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_match_sequence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_methodcall2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_module_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_fields_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_namedtuple_replace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_methods_returning_scalar_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ndarray_reshape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_non_inlined_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_constant_collections_as_input_int_or_float_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_np_finfo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_as_integer_ratio_num_type0_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_bit_length_num_type1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_number_method_method_hex_num_type5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_random_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_numpy_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_obj_is_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_ordered_dict_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_UDF_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_partials_lambda_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_as_input_partials_mod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_args_and_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_mix_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_graph_break_reconstruct_mix_no_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___builtins___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___call___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___closure___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___defaults___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___delattr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___dict___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___doc___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___format___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___hash___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___le___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___name___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___ne___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___new___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___repr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___setattr___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___str___dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr___subclasshook___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_attr_keywords_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_hasattr_set_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_partials_udf_kwarg_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_rand_tensor_partial_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_iterator_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_iterator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_iterator_graph_break_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_iterator_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_range_with_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_reduce_with_single_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_numpy_ndarray_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_return_tuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_round_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_in_frozenset_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_keys_view_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_update_bytecode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_set_update_list_with_duplicated_items_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_shape1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_size_tuple_add_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice6_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_slice_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sliced_range_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_startswith_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_shortcut_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_shortcut_with_start_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_sum_with_start_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_symbool_to_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_element_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_is_complex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_new_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_size_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tensor_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_to_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_distributions_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_size_as_dict_key_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_size_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_torch_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_transpose_for_scores_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple_iadd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_tuple_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_two_point_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unary_fold_op_seq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_ex2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unpack_mutable_map_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_unsqueeze_inplace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_viatorch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_zip_longest_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFunctionTests::test_zip_reconstruct_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_312_binary_slice_with_graph_break2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_add_sizes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_any_all_symnode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assigning_function_to_class_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assigning_function_to_object_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_assume_32_bit_indexing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_backward_deterministic_mode_mismatch_warning_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_bound_shape_checks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_build_tuple_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_bool_on_symbool_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_bool_on_symfloat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_complex_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_isinstance_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_str_on_user_defined_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_builtin_subclasses_as_method_on_class_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cannot_trace_mark_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cannot_trace_mark_dynamic_safe_unreached_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cast_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_catch_watchings1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cell_captured_by_existing_func_but_not_root_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cell_output1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_check_assert_error_at_runtime_when_predicate_false_and_message_has_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_check_compiles_when_predicate_true_constant_and_message_has_no_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_check_raises_at_runtime_when_predicate_false_and_message_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_check_raises_at_runtime_when_predicate_false_constant_and_message_has_no_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_class_duner_mro_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_out_of_scope_cell_with_cond_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_closure_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_tuple_neq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compare_shapes_with_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_compiled_class_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_export_single_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_runtime_assert_generation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cond_with_quantization_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_config_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_const_dict_variable_python_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_constant_getattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_cross_entropy_loss_fancy_ctor1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_custom_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_custom_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_fields_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dataclass_local_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_default_args_device_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_defaultdict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_deque_append_left_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_derpy_nn_module_usage_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_descriptor_side_effect_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dim_order_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_disable_flag_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dtypes_no_graphbreaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining1_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dunder_new_function_inlining_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_dynamic_override_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_force_parameter_static_shapes_and_property_static_shapes_override_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_precedence_over_int_specialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamic_sources_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_inside_custom_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_min_operator_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_dynamo_reset_clears_cache_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_as_dict_key_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_enum_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_existing_func_that_creates_capturing_nested_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_float_speculation_log_divergence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_fold_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dataclass_attr_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozen_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozenset_of_non_literals_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_frozenset_torch_func_contains_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_function_annotation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_function_generic_alias_annotation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_generate_tensor_from_list_of_numpy_primitive_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_attr_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_custom_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_get_instruction_source_311_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_getattr_dict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_getattrvariable_as_python_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_non_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_grad_state_mutated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_failure_fn_tensor_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_fn_by_is_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_inbuilt_nn_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_nn_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_filter_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_function_builder_with_cse_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guard_sym_node_fstring_when_used_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_guards_cse_pass_single_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_hasattr_nn_module_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_hash_getitem_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_guarded_object_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_id_of_nn_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_nn_mod2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_if_cond_user_defined_object_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_dict_function_passed_as_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_module_attr_dict_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inline_user_defined_dict_attr_clear_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_desugaring_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inplace_param_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_input_cell_mutation_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inspect_signature_bind_non_user_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_inspect_signature_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_int_comparisons_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_neg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_int_shape_binops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_compiling_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_floating_point2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_is_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_item_changes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_builtins_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_accumulate_tensors_user_defined_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_groupby_pure_python_key_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_cycle_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_infinite_repeat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_itertools_islice_default_end_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_jacfwd_one_hot_dynamic_compile_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_class_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_hasattr1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_hasattr2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_iterator_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_list_mul_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mandelbrot_numpy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_map_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mark_static_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mark_unbacked_strict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_matmul1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_min_max_over_iterable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_mro_type_tensor_no_source_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_named_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_namedtuple3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nan_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ne_operator_with_custom_graphbreak_eq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_try_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_try_with_graph_break_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nested_sequential_with_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nesteduserfunction_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_new_with_int_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_newly_constructed_tensor_attr_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_module_getattribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_nn_sequential_invocation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_error_on_nested_fx_trace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_no_raise_guard_partial_constraint_across_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_non_pt2_compliant_ops_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_not_dynamic_scope_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numel_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_array_of_arrays_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_int_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_min_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ndarray_graph_break_with_multiple_outputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ndarray_works_with_builtin_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_no_raise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_size_attr_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_subdtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ufunc_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_ufunc_out_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_numpy_unique_f16_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_object_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_ordered_dict_move_to_end_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_out_variant_custom_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_overridden_getattribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_patched_builtin_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_entry_hit_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_precompile_entry_miss_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_proxy_frozen_dataclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_pt2_compliant_ops_are_allowed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raise_guard_indirect_full_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raises_importerror1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_raises_importerror2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_range_iter_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recompile_on_disable_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_reconstruct_set_across_graph_break_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_recursive_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_release_module_memory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_release_scope_memory_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_repeat_interleave_graphbreaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_replay_side_effects_config_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_replay_side_effects_input_mut_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_repr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_return_dict_with_graph_break_and_update_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_return_nested_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_returning_nested_func_with_captured_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_running_nested_func_with_captured_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_runtime_assert_replacement_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_scalar_tensor_is_equivalent_to_int_list_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sequential_module_free_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_custom_tensor_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_set_discard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_setattr_mutation1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_constructor_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_create_symbolic_sizes_strides_storage_offset_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_equal_unbacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_env_recorded_function_fallback_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_int_comparisons_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_shape_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_size_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_slice_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_source_non_input_grad_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_assert1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_str_format_return1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_stride_dim_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_structseq1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_structseq2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_super_after_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_sym_constrain_range_on_replaced_unbacked_symbol_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_as_device_kwarg_multi_gpu_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_copy_into_unbacked_slice_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_symint_fold_nontrivial_product_modulo_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tagging_tensors_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor__iter___dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_build_list_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_ctor_list_of_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_dot_grad_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_item_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_item_no_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_layout_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_setattr_getset_descriptor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tensor_types_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_thread_local_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_0d_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_1d_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tolist_float_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_top_package_import_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_check_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_check_symbolic_shape_rel_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_distributions_lazy_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_dtype_python_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_dynamo_codegen_pow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_generator_set_state_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_objects_as_keys_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_torch_size_numel_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_trace_ndarray_frame_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_trace_ndarray_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tracing_nested_py_tree_mixed_all_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_iadd_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tuple_mul_with_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_tying_union_new_syntax_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_typevar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_typing_union_new_syntax_reconstruct_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_empty_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_repeat_cat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unbacked_strict_mode_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unhandled_exception_in_dynamo2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unique_consecutive_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_unpack4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_update_locals_and_stack_uses_shared_cache_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_class_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_object_class_interaction_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_defined_setattr2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_function_variable_supports_enum_argument_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_user_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_usr_cls_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_validate_outputs_unbacked_by_custom_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_validate_outputs_unbacked_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_variable_access_in_exception_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_with_builtin_type_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_writes_to_cells_across_frames1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_from_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesMiscTests::test_yield_gen_and_from_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_312_local_cell_overlap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_abc_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_add_complex_conj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_addr_alpha_beta_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_amp_foreach_fake_impl_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_aot_autograd_runtime_wrapper_prologue_profiled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_as_strided_on_base_with_mutation_works_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_as_strided_on_existing_view_banned_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_attached_attribute_in_dir_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_autograd_function_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_avoid_dupe_specialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_batch_norm_act_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bigbird_unsqueeze_inplace_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bitwise_op_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_bitwise_print_precedence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_boxes_len_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_build_map_unpack_with_call_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_c_defined_metaclass_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_chunk_reformer_ff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_contains_range_constprop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_convert_boxes_to_pooler_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_copy_weird_strides_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_create_rand_mask_from_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dalle2_maybe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dataclass_in_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_deferred_runtime_asserts_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_raises_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delattr_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delete_local_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delsubscr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_delsubscr_raises_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_detectron2_instances_cat_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_disabling_unpack_hooks_within_compiled_region_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_distributions_subclass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_do_paste_mask_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dont_aggressively_write_assert_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dropout_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shape_disable_duck_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_double_not_equal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamic_shapes_implicit_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_dynamo_disable_lru_cache_behavior_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_graph_nested_calls_fullgraph_True_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_empty_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_enum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_exception_in_dynamo_handling_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_exec_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_export_vs_dynamo_for_multiheadattention_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_flip_bad_accuracy_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_for_loop_graph_break_before_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_foreach_decomp_arg_names_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_fsdp_set_input_mutation_applied_when_input_gets_no_gradients_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_function_in_skipfiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_functools_wraps_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_gan_repro_trying_to_backward_through_the_graph_a_second_time_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_generator_dealloc_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_get_type_hints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_global_fn_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_grad_references_cleared_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_default_device_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_same_frame_fail_message_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_guard_with_tuple_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hasattr_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_t5_forward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_hf_xsoftmax_training_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inductor_dynamic_shapes_broadcasting_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inductor_no_recursionerror_on_for_loops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inference_mode_dynamic_shapes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_inplace_unsqueeze_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_int_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_intermediate_leaf_requires_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_invalid_seq_unpack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_isinstance_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_isinstance_storage_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue114171_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue126128_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue134451_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_issue175_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_jit_script_defaults_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_index_not_found_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_index_tensor_unsupported_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_reverse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_list_self_reference_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_longformer_chunk_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_longtensor_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_lru_cache_tracing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_maml_no_item_capture_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_many_overlapping_inputs_does_not_explode_guards_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_maybe_multiply_symint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_merge_criteria_processor_list2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_module_in_skipfiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_modules_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_multi_dot_import_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nanmean_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_negative_shape_guard_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_module_callable_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_module_property_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_param_freevar_codegen_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parameter_ctor_graph_breaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parameter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_nn_parametrize_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_grad_inline_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_tracing_into_eval_frame_ctx_manager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_no_tracing_into_eval_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_not_rewrite_assert_for_other_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_numpy_not_ndarray_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_odict_get_item_index_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_omegaconf_dictconfig_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_omegaconf_listconfig_contains_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_os_fspath_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_none_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_out_overload_non_contiguous_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_output_aliases_intermediate_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_overwriting_params_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_partially_initialized_module_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_partitioner_activation_memory_budget_with_unbacked_symints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_pointless_graph_removal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_preserve_stride_with_clone_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_randint_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_min_chunk_len_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reformer_sorting_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_reinplacing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_relative_import_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_requires_grad_guards_with_grad_mode2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_restricted_list_subclass3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_value_duplication_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_return_weakref_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_dont_change_bytecode_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_with_non_string_msg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_rewrite_assert_without_msg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_seq_append_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_setitem_tensor_prop_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_sigmoid_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_slicing_dynamic_shape_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_slicing_dynamic_shape_setitem_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_specialized_stride_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_staticmethod_allow_in_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_str_isalnum_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_string_format_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_subclass_graph_output_repro_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_super_staticmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_symint_bitwise_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_isinstance_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_random_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_backend_eager_func_name_func3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_set_data_mismatched_dtype_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_split_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_split_within_device_cm_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tensor_uniform_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_threading_local_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_tokenization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_compile_in_compile_frame_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_torch_tensor_ops_no_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unbacked_arange_in_bounds_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unpack_hooks_can_be_disabled_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unspecialized_nn_module_with_torch_variable_attribute_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_unsqueeze_mul_strides_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_ctor_ctx_manager_custom_init_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_user_defined_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_vc_bumped_in_inference_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_vdd_duplicate_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_view_dtype_overload_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_weakref_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_with_on_graph_break_inst_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_with_on_graph_break_nested_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesReproTests::test_zeros_out_dynamic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_basicmodule1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_basicmodule2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_call_fn_with_non_const_inputs_safe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_cfgmod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_conv_call_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_fnmember_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_forward_directly_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_hasattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_inject_module_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_intarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_iseval2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_isnonelayer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_istraining2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_layerlist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module7_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_bad_params_call_function_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_bad_params_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_lazy_module_no_cls_to_become_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_class_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_comparison_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_forward_has_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_guard_name_is_valid_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_module_property_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_moduledict_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_nn_module_setattr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameterdict_custom_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameterdict_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_parameters5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_seq_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_sequential_with_duplicated_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_stringmember_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_submodules2_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_super2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_super_class_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_tensorlist_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_torch_mangled_class_name_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_unsupportedmethod_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_unsupportedmodule_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesNNModuleTests::test_viamodulecall_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_access_class_method_from_user_class_attr_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_access_class_method_from_user_class_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_free_variables_overlapping_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_op_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_args_mismatch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_return_multiple_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_branch_return_non_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_mismatch_return_tensor_meta_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_non_tensor_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_raise_user_error_on_unsupported_pred_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_cond_supported_pred_types_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_constraint_violation_error_messages_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dict_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dict_return_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_reorder_with_non_tensor_arg_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_and_bypass_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dupes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamic_slicing_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamo_enum_in_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_dynamo_list_index_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_enforce_equalities_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_cond_in_aten_symbolic_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_decomp_asserts_bad_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_defaults_ok_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_control_flow_error_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_dim_not_1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_dim_range_constraint_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_complex_reorder_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_graph_with_list_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_identity_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_meta_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_meta_val_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_2_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_mismatched_out_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_module_specify_constraints_signature_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_nn_module_stack_patched_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_no_tensor_computation_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_raise_guard_partial_constraint_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_and_empty_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_args_with_default_tuple_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_builtin_op_on_assume_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_dynamic_shape_pred_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_cond_with_closed_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_and_class_method_multiarg_diff_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_and_class_method_multiarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_free_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_global_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_in_unspecialized_nn_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_list_nonzero_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_list_nonzero_free_function_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_none_control_flow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_none_control_flow_free_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_none_control_flow_free_func_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_not_return_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_constant_tuple_nonzero_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_functools_wrapped_method_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_and_empty_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_map_zero_sized_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_map_zero_sized_tensor_suppress_errors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_module_layer_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_nonzero_static_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_shallow_list_copy_with_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_shallow_list_copy_wo_side_effects_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_export_with_wrapped_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_exported_graph_serialization_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_func_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_func_return_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_fx_pytree_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_input_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_input_global_multiple_access_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_not_contains_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_list_unpack_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_map_cond_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_mixed_real_and_fake_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_nested_cond_op_param_buffer_lifted_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_no_tensor_computation_fail_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_not_functionalize_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_param_buffer_safe_from_mutation_recurse_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_param_buffer_safe_from_mutation_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_pre_dispatch_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_predispatch_with_higher_order_nested_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_preserve_fx_node_metadata_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_retracibility_dict_container_inp_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_strict_fake_tensor_prop_real_tensors_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_subclass_parameters_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_sum_param_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_trivial_constraint_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_uncaptured_higher_order_op_error_not_suppresed_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_untracked_inputs_in_constraints_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_and_out_different_shape_on_test_with_aten_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesExportTests::test_zeroes_in_new_shape_scalar_out_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_capi_call3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_control_flow5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_duck_size_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_kwarg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_order_dependence_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_dynamic_zero_inference_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_enumerate_not_break_graph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_indirect_unsupported1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_indirect_unsupported2_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_no_graph_break_on_item_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_pop_after_resume_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_restore_range_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_restore_range_iter_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_freevars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_tuple_iterator_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_with_no_grad2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_resume_with_no_grad3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_stack_state2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_start3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesSubGraphTests::test_tuple_iterator_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_allow_python_side_effects_utility_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_global_num_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_numpy_number_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_tracked_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_untracked_global_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_untracked_nonlocal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_capture_value_created_in_subgraph_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_branches_no_arguments_no_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_pytree_operands_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_pytree_operands_with_non_tensor_leaves_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_side_effect_in_one_branches_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_cond_with_constant_pred_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_enum_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_error_message_sane_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fallback_on_graph_break_complicated_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_flat_list_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_fn_with_kwargs_in_torch_ops_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_freevars_as_inputs_to_wrap_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_grad_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_no_hints_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hints_wrapper_pytree_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_hooks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_inlined_functions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_internal_nonlocal_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_lift_tensor_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_lift_tensors_with_compound_expressions_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_example_value_metadata_consistent_with_eager_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_multi_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_pytree_return_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_map_symint_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_nested_tuple_output_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_register_mode_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_var_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_var_used_multiple_times_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_return_captured_vars_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_global_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_del_existing_attr_nonlocal_module_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_tensor_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_global_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_mutate_nonlocal_tensor_builtin_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_global_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_global_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_existing_attr_nonlocal_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_module_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_side_effect_set_new_attr_nonlocal_obj_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_symint_in_slice_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_symint_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_tensor_and_unbacked_symbol_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_tensor_to_list_closure_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_vmap_source_fn_stack_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_all_kwarg_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_allow_local_assign_in_body_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_default_else_branch_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_kwarg_int_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_args_not_const_symint_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesHigherOrderOpTests::test_wrap_pytree_args_with_symint_constant_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_functional_call_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_functional_call_sequential_params_and_buffers_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_call_compiled_backward_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_call_torch_compile_fn_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_closure_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_fn_with_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_freevar_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_over_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_two_tensor_all_grad_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_grad_two_tensor_has_aux_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_hessian_argnums_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_hessian_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacfwd_two_tensors_argnums_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacrev_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jacrev_has_aux_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_freevar_python_scalar_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_freevar_tensor_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_simple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_jvp_two_tensors_disable_enable_disable_grad_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_teardown_resets_nested_graph_breaks_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vjp_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_free_const_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_get_wrapped_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_kwargs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_invocation_out_dims_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_diff_dims_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_multiple_outputs_out_dims_tuple_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_new_tensor_implicit_via_op_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_new_tensor_in_body_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_out_dims_None_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_over_vmap_two_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_different_config_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_same_config_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_recompile_with_randomness_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_side_effects_append_input_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_two_inputs_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesFuncTorchHigherOrderOpTests::test_vmap_with_graph_break_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_LSTM_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_autograd_expand_mutation_functionalizes_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_export_joint_simple_repro_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_aot_grad_mode_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_arg_dupe_via_dynamo_recompiles_many_args_param_non_tensor_arg_list_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_safe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe_control_flow_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_call_fn_with_non_const_inputs_aot_unsafe_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_data_ptr_access_fails_in_forward_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_different_inputs_overlapping_set_with_mutation_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer5_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer6_dynamic_shapes, 
test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph2_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph3_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_donated_buffer_with_retain_or_create_graph4_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_mutation1_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_negative_testing_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesAotAutogradFallbackTests::test_requires_grad_fake_via_dynamo_recompiles_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_intermediate_attr_access_SDPAParams_dynamic_shapes, test/dynamo/test_dynamic_shapes.py::DynamicShapesTestSDPA::test_sdpa_kernel_decorator_with_compile_dynamic_shapes 2025-12-04T09:54:29.2856275Z 2025-12-04T09:54:29.2856431Z Finished dynamo/test_dynamic_shapes 2/2 ... [2025-12-04 09:54:29.257064][3568577.781874382], took 6.90min 2025-12-04T09:54:29.2856837Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:54:29.2857200Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:54:29.2857600Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T09:54:29.2857788Z Uploading artifacts took 0.00 seconds 2025-12-04T09:54:29.2857970Z Running inductor/test_cuda_repro 1/1 ... 
[2025-12-04 09:54:29.263052][3568577.787865659] 2025-12-04T09:54:29.2858157Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:54:29.2858551Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_repro.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:54:29.263245] 2025-12-04T09:55:30.2023828Z 2025-12-04T09:55:30.2028045Z inductor/test_cuda_repro 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cuda_repro_1.1_3bf7a466f8f95b3c_.log 2025-12-04T09:55:30.2049073Z Running 96 items in this shard: test/inductor/test_cuda_repro.py::CudaReproTests::test_3d_tiling, test/inductor/test_cuda_repro.py::CudaReproTests::test_accuracy_issue1, test/inductor/test_cuda_repro.py::CudaReproTests::test_adaptive_avg_pool3d_issue_157248, test/inductor/test_cuda_repro.py::CudaReproTests::test_atomic_add_bfloat16, test/inductor/test_cuda_repro.py::CudaReproTests::test_autotune_inplace_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_backward_context, test/inductor/test_cuda_repro.py::CudaReproTests::test_bool_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_dynamic_dense, test/inductor/test_cuda_repro.py::CudaReproTests::test_bucketize_epilogue, test/inductor/test_cuda_repro.py::CudaReproTests::test_cat_int8_one_kernel, test/inductor/test_cuda_repro.py::CudaReproTests::test_cpu_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_deterministic_algorithms, test/inductor/test_cuda_repro.py::CudaReproTests::test_dont_inplace_disjoint_accesses, test/inductor/test_cuda_repro.py::CudaReproTests::test_dtype_factory_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_persistent_reductions, test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_shapes, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_dynamic_to_static_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding, test/inductor/test_cuda_repro.py::CudaReproTests::test_effn_attn_bias_padding_misaligned, test/inductor/test_cuda_repro.py::CudaReproTests::test_embedding_var_mean, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_low_precision, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_mean_ratio_chain, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_min_pow_chain, test/inductor/test_cuda_repro.py::CudaReproTests::test_emulate_precision_casts_norm_rounding, test/inductor/test_cuda_repro.py::CudaReproTests::test_epilogue_fusion_with_view, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_expanded_inputs_cudagraphs_no_size_asserts, test/inductor/test_cuda_repro.py::CudaReproTests::test_flash_attention_dynamic, test/inductor/test_cuda_repro.py::CudaReproTests::test_float64_constants, test/inductor/test_cuda_repro.py::CudaReproTests::test_float8_e8m0fnu, test/inductor/test_cuda_repro.py::CudaReproTests::test_full_copy, test/inductor/test_cuda_repro.py::CudaReproTests::test_identity_load, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_add_fallback, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_inplace_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_index_put_no_fallback_cudagraph, test/inductor/test_cuda_repro.py::CudaReproTests::test_indirect_indexing_dense_mask, test/inductor/test_cuda_repro.py::CudaReproTests::test_inductor_output_aliases_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_add_alpha_autotune, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_buffer_autotune, test/inductor/test_cuda_repro.py::CudaReproTests::test_inplace_updates_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_input_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_int64_index_intermediate, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue100806, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103461, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue103481, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue104759, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_1input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue97695_2input, test/inductor/test_cuda_repro.py::CudaReproTests::test_issue_103924, test/inductor/test_cuda_repro.py::CudaReproTests::test_libdevice_routing, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_cpu_input, test/inductor/test_cuda_repro.py::CudaReproTests::test_linear_with_zero_infeature_size, test/inductor/test_cuda_repro.py::CudaReproTests::test_lookup_seed_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_max_autotune_nograd, test/inductor/test_cuda_repro.py::CudaReproTests::test_memory_history_inductor, test/inductor/test_cuda_repro.py::CudaReproTests::test_mm_out_dtype_compile, test/inductor/test_cuda_repro.py::CudaReproTests::test_multi_output_layout_fallback, test/inductor/test_cuda_repro.py::CudaReproTests::test_mutated_aligned_tensor, test/inductor/test_cuda_repro.py::CudaReproTests::test_negative_arange_dynamic_shapes, test/inductor/test_cuda_repro.py::CudaReproTests::test_no_device_idx_repro_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_commutative_scan_op, test/inductor/test_cuda_repro.py::CudaReproTests::test_non_contiguous_unaligned_input_indices, test/inductor/test_cuda_repro.py::CudaReproTests::test_normalize_norm_leq_one, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_not_initializing_wrong_device, test/inductor/test_cuda_repro.py::CudaReproTests::test_permute_fusion, test/inductor/test_cuda_repro.py::CudaReproTests::test_qwen2_7b_sdpa_input_alignment_requires_recompile, test/inductor/test_cuda_repro.py::CudaReproTests::test_red_dtype_mismatch, test/inductor/test_cuda_repro.py::CudaReproTests::test_reflection_pad_loop_order, test/inductor/test_cuda_repro.py::CudaReproTests::test_repeated_masked_load, test/inductor/test_cuda_repro.py::CudaReproTests::test_scalar_triton_index, test/inductor/test_cuda_repro.py::CudaReproTests::test_scaled_dot_product_efficient_attention_backward, test/inductor/test_cuda_repro.py::CudaReproTests::test_scatter_index_not_wrapped, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape0_quantiles_strides0_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape1_quantiles_strides1_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape2_quantiles_strides2_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape3_quantiles_strides3_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape4_quantiles_strides4_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape5_quantiles_strides5_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape6_quantiles_strides6_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_searchsorted_stride_permutations_quantiles_shape7_quantiles_strides7_batch_size_16, test/inductor/test_cuda_repro.py::CudaReproTests::test_selecsls42b_misaligned_address, 
test/inductor/test_cuda_repro.py::CudaReproTests::test_simplify_dims, test/inductor/test_cuda_repro.py::CudaReproTests::test_sort_stride_issue, test/inductor/test_cuda_repro.py::CudaReproTests::test_sorted_masks, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_channels_last, test/inductor/test_cuda_repro.py::CudaReproTests::test_split_reduction_transposed, test/inductor/test_cuda_repro.py::CudaReproTests::test_triton_interpret, test/inductor/test_cuda_repro.py::CudaReproTests::test_truediv_base_not_bitwise_equivalent, test/inductor/test_cuda_repro.py::CudaReproTests::test_truediv_emulate_divison_rounding, test/inductor/test_cuda_repro.py::CudaReproTests::test_uint_view_copy, test/inductor/test_cuda_repro.py::CudaReproTests::test_unspec_inputs_interop, test/inductor/test_cuda_repro.py::CudaReproTests::test_unused_cpu_input_cudagraphs, test/inductor/test_cuda_repro.py::CudaReproTests::test_view_replay_padding_issue_163328, test/inductor/test_cuda_repro.py::CudaReproTests::test_xlnet_lm_stride_repro 2025-12-04T09:55:30.2061348Z 2025-12-04T09:55:30.2061464Z Finished inductor/test_cuda_repro 1/1 ... [2025-12-04 09:55:30.202170][3568638.726979054], took 1.02min 2025-12-04T09:55:30.2061897Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:55:30.2091728Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:55:30.2094998Z Running dynamo/test_after_aot 1/1 ... [2025-12-04 09:55:30.209325][3568638.734138492] 2025-12-04T09:55:30.2095424Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:55:30.2096625Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_after_aot.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 09:55:30.209551] 2025-12-04T09:55:37.1348470Z 2025-12-04T09:55:37.1349317Z dynamo/test_after_aot 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_after_aot_1.1_5cbf15b6597349e7_.log 2025-12-04T09:55:37.1350058Z Running 2 items in this shard: test/dynamo/test_after_aot.py::TestAfterAot::test_dump_tensor, test/dynamo/test_after_aot.py::TestAfterAot::test_save_graph_repro 2025-12-04T09:55:37.1350389Z 2025-12-04T09:55:37.1350575Z Finished dynamo/test_after_aot 1/1 ... [2025-12-04 09:55:37.134572][3568645.659381529], took 0.12min 2025-12-04T09:55:37.1352972Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:55:37.1412539Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:55:37.1414634Z Running inductor/test_snode_runtime 1/1 ... [2025-12-04 09:55:37.141335][3568645.666148514] 2025-12-04T09:55:37.1414909Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:55:37.1416321Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_snode_runtime.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 09:55:37.141544] 2025-12-04T09:55:50.0773545Z 2025-12-04T09:55:50.0774492Z inductor/test_snode_runtime 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_snode_runtime_1.1_f895aa26a66aacd0_.log 2025-12-04T09:55:50.0779770Z Running 22 items in this shard: test/inductor/test_snode_runtime.py::UnsupportedTests::test_no_cuda, test/inductor/test_snode_runtime.py::UnsupportedTests::test_no_op, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_addmm, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_bmm, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_conv1d, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_conv2d, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_conv2d_transpose, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_conv3d, test/inductor/test_snode_runtime.py::ComputeBoundedTests::test_mm, test/inductor/test_snode_runtime.py::MemoryBoundedTests::test_dynamic, test/inductor/test_snode_runtime.py::MemoryBoundedTests::test_horizontal_reduction_pointwise, test/inductor/test_snode_runtime.py::MemoryBoundedTests::test_pointwise, test/inductor/test_snode_runtime.py::MemoryBoundedTests::test_relu, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_all_gather_into_tensor, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_all_gather_into_tensor_coalesced, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_all_reduce, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_all_reduce_coalesced, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_legacy_all_gather_into_tensor_coalesced, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_legacy_all_reduce, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_legacy_all_reduce_coalesced, test/inductor/test_snode_runtime.py::TestCommAnalysis::test_reduce_scatter_tensor, 
test/inductor/test_snode_runtime.py::TestCommAnalysis::test_reduce_scatter_tensor_coalesced 2025-12-04T09:55:50.0784013Z 2025-12-04T09:55:50.0784219Z Finished inductor/test_snode_runtime 1/1 ... [2025-12-04 09:55:50.076998][3568658.601807169], took 0.22min 2025-12-04T09:55:50.0784882Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:55:50.0839657Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:55:50.0840115Z Running inductor/test_minifier 1/1 ... [2025-12-04 09:55:50.083866][3568658.608679712] 2025-12-04T09:55:50.0840537Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:55:50.0842629Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_minifier.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 09:55:50.084098] 2025-12-04T09:57:02.9647489Z 2025-12-04T09:57:02.9648530Z inductor/test_minifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_minifier_1.1_2a9cd6b15fa5d80d_.log 2025-12-04T09:57:02.9653208Z Running 14 items in this shard: test/inductor/test_minifier.py::MinifierTests::test_accuracy_vs_strict_accuracy, test/inductor/test_minifier.py::MinifierTests::test_after_aot_cpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_cpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_gpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_after_aot_gpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_cpu_compile_error_unflatten, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_accuracy_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_compile_error, test/inductor/test_minifier.py::MinifierTests::test_aoti_gpu_compile_error_unflatten, test/inductor/test_minifier.py::MinifierTests::test_constant_in_graph, test/inductor/test_minifier.py::MinifierTests::test_offload_to_disk, test/inductor/test_minifier.py::MinifierTests::test_rmse_improves_over_atol 2025-12-04T09:57:02.9656187Z 2025-12-04T09:57:02.9656402Z Finished inductor/test_minifier 1/1 ... [2025-12-04 09:57:02.964538][3568731.489348409], took 1.21min 2025-12-04T09:57:02.9657130Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:57:02.9710919Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:57:02.9713181Z Running inductor/test_perf 1/1 ... 
[2025-12-04 09:57:02.971161][3568731.495975166] 2025-12-04T09:57:02.9713470Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:57:02.9715396Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_perf.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:57:02.971378] 2025-12-04T09:57:38.1927699Z 2025-12-04T09:57:38.1928437Z inductor/test_perf 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_perf_1.1_ff66005b0f9db417_.log 2025-12-04T09:57:38.1935689Z Running 66 items in this shard: test/inductor/test_perf.py::NumBytesMetricTests::test_cat, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_config_option, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_complex_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_cat_pointwise_many_simple_inputs, test/inductor/test_perf.py::NumBytesMetricTests::test_extern, test/inductor/test_perf.py::NumBytesMetricTests::test_index, test/inductor/test_perf.py::NumBytesMetricTests::test_pointwise, test/inductor/test_perf.py::NumBytesMetricTests::test_reduction, test/inductor/test_perf.py::FusionTests::test_create_block_mask, test/inductor/test_perf.py::FusionTests::test_double_softmax, test/inductor/test_perf.py::FusionTests::test_factory_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_outer_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_pointwise2, test/inductor/test_perf.py::FusionTests::test_horizontal_reduction_reduction, test/inductor/test_perf.py::FusionTests::test_horizontal_sum_pw_broadcast, test/inductor/test_perf.py::FusionTests::test_index_pointwise, 
test/inductor/test_perf.py::FusionTests::test_index_reduction, test/inductor/test_perf.py::FusionTests::test_layer_norm, test/inductor/test_perf.py::FusionTests::test_mutation_fusion, test/inductor/test_perf.py::FusionTests::test_neighbor, test/inductor/test_perf.py::FusionTests::test_norm_chain, test/inductor/test_perf.py::FusionTests::test_pointwise_multi_level_reduction, test/inductor/test_perf.py::FusionTests::test_reduction_pointwise_multi_level_reduction, test/inductor/test_perf.py::FusionTests::test_softmax_backward, test/inductor/test_perf.py::FusionTests::test_softmax_inner, test/inductor/test_perf.py::FusionTests::test_vertical_sum_pw, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice1, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice2, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice3, test/inductor/test_perf.py::SchedulerFusionTests::test_fusion_choice4_cpu, test/inductor/test_perf.py::TilingTests::test_tiling_simple, test/inductor/test_perf.py::TilingTests::test_tiling_three, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_cat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_dtype, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_full_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_keops, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_long_chain_add, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_partial_remat, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_relu, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_unremat_bw2, test/inductor/test_perf.py::MinCutPartitioningTests::test_partitioning_with_view, test/inductor/test_perf.py::NoopTests::test_noop_cat, test/inductor/test_perf.py::NoopTests::test_noop_clones, 
test/inductor/test_perf.py::NoopTests::test_noop_device_conversion, test/inductor/test_perf.py::NoopTests::test_noop_dtype_conversion, test/inductor/test_perf.py::NoopTests::test_noop_int_ops, test/inductor/test_perf.py::NoopTests::test_noop_slice_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_intermediate, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_training_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_custom_op_two_mutated_inputs, test/inductor/test_perf.py::InplacingTests::test_inplace_randperm_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter, test/inductor/test_perf.py::InplacingTests::test_inplace_scatter_noop_view, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_training, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v1, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v2, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v3, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v4, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v5, test/inductor/test_perf.py::InplacingTests::test_inplace_triton_kernel_v6, test/inductor/test_perf.py::InplacingTests::test_triton_kernel_not_fusable_with_users 2025-12-04T09:57:38.1942254Z 2025-12-04T09:57:38.1942363Z Finished inductor/test_perf 1/1 ... 
[2025-12-04 09:57:38.192677][3568766.717484796], took 0.59min 2025-12-04T09:57:38.1942743Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:57:38.1996075Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:57:38.1998187Z Running inductor/test_fused_attention 1/1 ... [2025-12-04 09:57:38.199685][3568766.724498506] 2025-12-04T09:57:38.1998384Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:57:38.2000054Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_fused_attention.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 09:57:38.199897] 2025-12-04T09:59:15.3935175Z 2025-12-04T09:59:15.3935847Z inductor/test_fused_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_fused_attention_1.1_0012c01c7f81f8fc_.log 2025-12-04T09:59:15.3952285Z Running 108 items in this shard: test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_insignificant_strides, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_reuse_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_tensor_factor_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_pattern_fails_with_unsupported_mask_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_prev_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_10_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_11_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_12_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_17_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_19_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_1_freezing, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_1_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_20_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_21_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_22_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_23_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_24_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_2_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_3_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_4_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_5_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_6_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_7_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_8_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuTests::test_sdpa_rewriter_9_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_insignificant_strides, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_reuse_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_tensor_factor_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_pattern_fails_with_unsupported_mask_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_prev_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_10_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_11_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_12_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_13_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_14_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_15_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_17_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_19_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_1_freezing, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_1_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_20_gpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_21_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_22_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_23_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_24_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_2_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_3_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_4_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_5_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_6_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_7_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_8_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterGpuDynamicTests::test_sdpa_rewriter_9_gpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_reuse_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_tensor_factor_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_pattern_fails_with_unsupported_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_prev_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_11_cpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_12_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_16_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_16_fp32_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_17_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_18_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_19_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_1_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_20_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_21_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_22_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_23_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_24_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_2_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_5_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_reuse_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_tensor_factor_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_pattern_fails_with_unsupported_mask_cpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_prev_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_11_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_12_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_13_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_14_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_15_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_16_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_16_fp32_mask_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_17_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_18_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_19_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_1_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_20_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_21_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_22_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_23_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_24_cpu, 
test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_2_cpu, test/inductor/test_fused_attention.py::SDPAPatternRewriterCpuDynamicTests::test_sdpa_rewriter_5_cpu 2025-12-04T09:59:15.3967068Z 2025-12-04T09:59:15.3967198Z Finished inductor/test_fused_attention 1/1 ... [2025-12-04 09:59:15.393421][3568863.91823082], took 1.62min 2025-12-04T09:59:15.3967596Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T09:59:15.4001120Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T09:59:15.4003804Z Running inductor/test_mkldnn_pattern_matcher 1/2 ... [2025-12-04 09:59:15.400195][3568863.925008374] 2025-12-04T09:59:15.4004026Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T09:59:15.4005211Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_mkldnn_pattern_matcher.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 09:59:15.400421] 2025-12-04T10:03:26.4842606Z 2025-12-04T10:03:26.4843611Z inductor/test_mkldnn_pattern_matcher 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_mkldnn_pattern_matcher_1.2_9522f6dbcd2f44d3_.log 2025-12-04T10:03:26.4878915Z Running 143 items in this shard: test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_conv2d_binary_fusion_failed, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_False_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_bfloat16_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_False_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_False_M_32_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_1_inplace_add_True_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_False_expand_a_scale_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_da8w8_sym_act_sym_wgt_with_int_mm_has_bias_True_float32_dynamic_True_reshape_a_True_M_32_inplace_add_True_expand_a_scale_True, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_dynamic_qlinear_qat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_hardtanh_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_leaky_relu_pattern_fallback, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_binary_broadcast_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_relu_dynamic_fp16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_linear_unary, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qat_qconv2d_relu6, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qcat, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_3, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_add_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_dequant_promotion_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardswish_xpu, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_hardtanh_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu6_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_silu_int8_mixed_bf16_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qconv2d_with_concat_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qflatten, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_cpu_use_relu_True_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_fp8_inductor_cpu, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_False_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_False_is_qat_True_is_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_int8_mixed_bf16_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_add_xpu_use_relu_True_is_qat_False_is_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_dequant_promotion_int8_mixed_bf16_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_gelu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_and_not_contiguous, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_input_dim_exceeds_2_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_int8_mixed_bf16_use_autocast, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_cpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_mul_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_input_dim_exceeds_2_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_int8_mixed_bf16_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qlinear_relu_xpu, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_qmaxpool2d, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_reproduce_121253_issue_addmm_fusion_check, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_bfloat16_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_False, 
test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_False_float32_per_channel_quant_True_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_bfloat16_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_False_dynamic_True, test/inductor/test_mkldnn_pattern_matcher.py::TestPatternMatcher::test_smooth_quant_with_int_mm_has_bias_True_float32_per_channel_quant_True_dynamic_False, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_linear_input_non_contiguous_3D_wo_bias_dynamic_shapes, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_q_attention_block, test/inductor/test_mkldnn_pattern_matcher.py::TestDynamicPatternMatcher::test_qconv2d_maxpool2d_linear_dynamic_cpu 2025-12-04T10:03:26.4905751Z 2025-12-04T10:03:26.4905886Z Finished inductor/test_mkldnn_pattern_matcher 1/2 ... [2025-12-04 10:03:26.484265][3569115.009053925], took 4.18min 2025-12-04T10:03:26.4906296Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T10:03:26.4915689Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:03:26.4917801Z Running inductor/test_cpu_select_algorithm 1/1 ... 
[2025-12-04 10:03:26.491687][3569115.016499518] 2025-12-04T10:03:26.4918067Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:03:26.4920211Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cpu_select_algorithm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:03:26.491932] 2025-12-04T10:03:32.0078930Z 2025-12-04T10:03:32.0080064Z inductor/test_cpu_select_algorithm 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_cpu_select_algorithm_1.1_5be7c75cf73d5a51_.log 2025-12-04T10:03:32.0080883Z Running 0 items in this shard: 2025-12-04T10:03:32.0081074Z 2025-12-04T10:03:32.0081388Z Finished inductor/test_cpu_select_algorithm 1/1 ... [2025-12-04 10:03:32.007585][3569120.532393814], took 0.09min 2025-12-04T10:03:32.0086144Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T10:03:32.0147101Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:03:32.0148865Z Running inductor/test_cuda_select_algorithm 1/1 ... [2025-12-04 10:03:32.014774][3569120.539587964] 2025-12-04T10:03:32.0149267Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:03:32.0151152Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_cuda_select_algorithm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:03:32.014995] 2025-12-04T10:49:10.9179433Z 2025-12-04T10:49:10.9179769Z PRINTING LOG FILE of inductor/test_cuda_select_algorithm 1/1 (test/test-reports/inductor.test_cuda_select_algorithm_1.1_a2cc8512cf78dd46_.log) 2025-12-04T10:49:10.9180275Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-15aa08761d81af9d.xml 2025-12-04T10:49:10.9180606Z ============================= test session starts ============================== 2025-12-04T10:49:10.9180840Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9181037Z cachedir: .pytest_cache 2025-12-04T10:49:10.9181299Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9181554Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9181679Z configfile: pytest.ini 2025-12-04T10:49:10.9181985Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9182229Z collecting ... 
collected 58 items 2025-12-04T10:49:10.9182371Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T10:49:10.9194985Z Running 58 items in this shard: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16, 
test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16, test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9207122Z 2025-12-04T10:49:10.9207380Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5332s] [ 1%] 2025-12-04T10:49:10.9207932Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5053s] [ 1%] 2025-12-04T10:49:10.9208454Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.4971s] [ 1%] 2025-12-04T10:49:10.9208719Z 2025-12-04T10:49:10.9208779Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9209038Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9209281Z Traceback (most recent call last): 2025-12-04T10:49:10.9209545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9209829Z method(*args, **kwargs) 2025-12-04T10:49:10.9210081Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9210316Z method(*args, **kwargs) 2025-12-04T10:49:10.9210543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9210775Z with policy(): 2025-12-04T10:49:10.9210995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9211247Z raise RuntimeError(msg) 2025-12-04T10:49:10.9211785Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9212273Z 2025-12-04T10:49:10.9212352Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9212766Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9213105Z 2025-12-04T10:49:10.9213194Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9213414Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9213591Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9213905Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9214234Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9214390Z graph_break [] 
2025-12-04T10:49:10.9214658Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9214904Z Traceback (most recent call last): 2025-12-04T10:49:10.9215179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9215429Z method(*args, **kwargs) 2025-12-04T10:49:10.9215689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9215925Z method(*args, **kwargs) 2025-12-04T10:49:10.9216150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9216378Z with policy(): 2025-12-04T10:49:10.9216598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9216836Z raise RuntimeError(msg) 2025-12-04T10:49:10.9217328Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9217769Z 2025-12-04T10:49:10.9217851Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9218261Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9218588Z 2025-12-04T10:49:10.9218678Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9218926Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9219099Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9219374Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9219688Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9219838Z graph_break [] 2025-12-04T10:49:10.9219968Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9220136Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9220301Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9220611Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9220866Z graph_break [] 2025-12-04T10:49:10.9220981Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9221236Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9221476Z Traceback (most recent call last): 2025-12-04T10:49:10.9221733Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9222016Z method(*args, **kwargs) 2025-12-04T10:49:10.9222237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9222494Z method(*args, **kwargs) 2025-12-04T10:49:10.9222707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9222933Z with policy(): 2025-12-04T10:49:10.9223140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9223366Z raise RuntimeError(msg) 2025-12-04T10:49:10.9223877Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9224353Z 2025-12-04T10:49:10.9224446Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9224860Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9225222Z 2025-12-04T10:49:10.9225307Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9225507Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9225674Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9225945Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9226228Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9226376Z graph_break [] 2025-12-04T10:49:10.9226501Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9226670Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9226829Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9227138Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9227432Z graph_break [] 2025-12-04T10:49:10.9227557Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9227723Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9227884Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9228169Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), 
('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9228454Z graph_break [] 2025-12-04T10:49:10.9228770Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-15aa08761d81af9d.xml - 2025-12-04T10:49:10.9229108Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9229859Z FAILED [0.4971s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9230557Z 2025-12-04T10:49:10.9230630Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9231041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9231394Z 2025-12-04T10:49:10.9231480Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9231675Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9231834Z ========================== 1 failed, 2 rerun in 3.70s ========================== 2025-12-04T10:49:10.9232021Z Got exit code 1 2025-12-04T10:49:10.9232118Z Retrying single test... 
2025-12-04T10:49:10.9232424Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-eccf8a19b6130b9e.xml 2025-12-04T10:49:10.9232718Z ============================= test session starts ============================== 2025-12-04T10:49:10.9232933Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9233131Z cachedir: .pytest_cache 2025-12-04T10:49:10.9233357Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9233606Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9233723Z configfile: pytest.ini 2025-12-04T10:49:10.9233956Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9234252Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9234658Z stepcurrent: skipping 0 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9235032Z Running 1 items in this shard 2025-12-04T10:49:10.9235108Z 2025-12-04T10:49:10.9235479Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:03:49.758518687 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9235907Z 2025-12-04T10:49:10.9236063Z [W1204 10:03:56.415824032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9236253Z 2025-12-04T10:49:10.9236404Z [W1204 10:03:56.415972858 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9236588Z 2025-12-04T10:49:10.9236771Z [W1204 10:03:56.416391008 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9236959Z 2025-12-04T10:49:10.9237108Z [W1204 10:03:56.416487366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9237298Z 2025-12-04T10:49:10.9237447Z [W1204 10:03:56.417225989 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9237640Z 2025-12-04T10:49:10.9237789Z [W1204 10:03:56.417297557 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9237977Z 2025-12-04T10:49:10.9238125Z [W1204 10:03:56.417410255 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9238312Z 2025-12-04T10:49:10.9238466Z [W1204 10:03:56.417472973 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9238658Z 2025-12-04T10:49:10.9238808Z [W1204 10:03:56.422141374 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9238997Z 2025-12-04T10:49:10.9239149Z [W1204 10:03:56.422241632 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9239341Z 2025-12-04T10:49:10.9239493Z [W1204 10:03:56.422314590 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9239684Z 2025-12-04T10:49:10.9239862Z [W1204 10:03:56.422414278 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9240046Z 2025-12-04T10:49:10.9240218Z [W1204 10:03:56.422475907 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9240403Z 2025-12-04T10:49:10.9240552Z [W1204 10:03:56.422571074 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9240739Z 2025-12-04T10:49:10.9240886Z [W1204 10:03:56.422630183 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9241071Z 2025-12-04T10:49:10.9241220Z [W1204 10:03:56.422716101 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9241409Z 2025-12-04T10:49:10.9241557Z [W1204 10:03:56.422774720 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9241744Z 2025-12-04T10:49:10.9241928Z [W1204 10:03:56.463714725 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9242114Z 2025-12-04T10:49:10.9242261Z [W1204 10:03:56.463829872 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9242449Z 2025-12-04T10:49:10.9242599Z [W1204 10:03:56.463904011 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9242790Z 2025-12-04T10:49:10.9242944Z [W1204 10:03:56.464013428 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9243165Z 2025-12-04T10:49:10.9243314Z [W1204 10:03:56.464075946 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9243500Z 2025-12-04T10:49:10.9243673Z [W1204 10:03:56.464172214 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9243861Z 2025-12-04T10:49:10.9304167Z [W1204 10:03:56.464231773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9304513Z 2025-12-04T10:49:10.9304721Z [W1204 10:03:56.464319211 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9304934Z 2025-12-04T10:49:10.9305128Z [W1204 10:03:56.464377440 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9305382Z 2025-12-04T10:49:10.9305466Z ('RERUN', {'yellow': True}) [10.3434s] [100%] 2025-12-04T10:49:10.9306018Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:03:58.580207681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9306475Z 2025-12-04T10:49:10.9306635Z [W1204 10:03:58.580385727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9306840Z 2025-12-04T10:49:10.9306993Z [W1204 10:03:58.580461266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9307191Z 2025-12-04T10:49:10.9307354Z [W1204 10:03:58.580570213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9307572Z 2025-12-04T10:49:10.9307745Z [W1204 10:03:58.580636672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9307976Z 2025-12-04T10:49:10.9308204Z [W1204 10:03:58.580736929 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9308401Z 2025-12-04T10:49:10.9308560Z [W1204 10:03:58.580798008 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9308751Z 2025-12-04T10:49:10.9308910Z [W1204 10:03:58.580883106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9309099Z 2025-12-04T10:49:10.9309256Z [W1204 10:03:58.580942044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9309459Z 2025-12-04T10:49:10.9309617Z [W1204 10:03:58.583581203 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9309812Z 2025-12-04T10:49:10.9309968Z [W1204 10:03:58.583683481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9310168Z 2025-12-04T10:49:10.9310328Z [W1204 10:03:58.583757889 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9310524Z 2025-12-04T10:49:10.9310677Z [W1204 10:03:58.583851967 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9310870Z 2025-12-04T10:49:10.9311023Z [W1204 10:03:58.583912985 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9311251Z 2025-12-04T10:49:10.9311406Z [W1204 10:03:58.584013633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9311601Z 2025-12-04T10:49:10.9311762Z [W1204 10:03:58.584075391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9312011Z 2025-12-04T10:49:10.9312182Z [W1204 10:03:58.584160659 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9312378Z 2025-12-04T10:49:10.9312547Z [W1204 10:03:58.584218988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9312743Z 2025-12-04T10:49:10.9312908Z [W1204 10:03:58.621842651 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9313107Z 2025-12-04T10:49:10.9313276Z [W1204 10:03:58.621948959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9313468Z 2025-12-04T10:49:10.9313638Z [W1204 10:03:58.622027387 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9313838Z 2025-12-04T10:49:10.9313997Z [W1204 10:03:58.622129705 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9314202Z 2025-12-04T10:49:10.9314364Z [W1204 10:03:58.622191693 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9314565Z 2025-12-04T10:49:10.9314721Z [W1204 10:03:58.622288771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9314919Z 2025-12-04T10:49:10.9315078Z [W1204 10:03:58.622349950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9315280Z 2025-12-04T10:49:10.9315443Z [W1204 10:03:58.622436118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9315643Z 2025-12-04T10:49:10.9315838Z [W1204 10:03:58.622494806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9316037Z 2025-12-04T10:49:10.9316108Z ('RERUN', {'yellow': True}) [0.5626s] [100%] 2025-12-04T10:49:10.9316574Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:03:58.126317288 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9316983Z 2025-12-04T10:49:10.9317146Z [W1204 10:03:58.126491554 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9317337Z 2025-12-04T10:49:10.9317500Z [W1204 10:03:58.126567912 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9317693Z 2025-12-04T10:49:10.9317860Z [W1204 10:03:58.126673590 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9318065Z 2025-12-04T10:49:10.9318220Z [W1204 10:03:58.126736038 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9318418Z 2025-12-04T10:49:10.9318578Z [W1204 10:03:58.126836626 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9318775Z 2025-12-04T10:49:10.9318964Z [W1204 10:03:58.126896715 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9319164Z 2025-12-04T10:49:10.9319322Z [W1204 10:03:58.126983633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9319518Z 2025-12-04T10:49:10.9319676Z [W1204 10:03:58.127046551 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9319877Z 2025-12-04T10:49:10.9320043Z [W1204 10:03:58.129638001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9320238Z 2025-12-04T10:49:10.9320401Z [W1204 10:03:58.129734599 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9320595Z 2025-12-04T10:49:10.9320762Z [W1204 10:03:58.129808027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9320958Z 2025-12-04T10:49:10.9321119Z [W1204 10:03:58.129902365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9321310Z 2025-12-04T10:49:10.9321475Z [W1204 10:03:58.129962133 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9321668Z 2025-12-04T10:49:10.9321829Z [W1204 10:03:58.130060451 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9322112Z 2025-12-04T10:49:10.9322268Z [W1204 10:03:58.130120700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9322470Z 2025-12-04T10:49:10.9322624Z [W1204 10:03:58.130204718 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9322826Z 2025-12-04T10:49:10.9322982Z [W1204 10:03:58.130281576 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9323182Z 2025-12-04T10:49:10.9323373Z [W1204 10:03:58.168459836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9323575Z 2025-12-04T10:49:10.9323730Z [W1204 10:03:58.168565764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9323932Z 2025-12-04T10:49:10.9324086Z [W1204 10:03:58.168641262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9324284Z 2025-12-04T10:49:10.9324448Z [W1204 10:03:58.168742840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9324641Z 2025-12-04T10:49:10.9324804Z [W1204 10:03:58.168804158 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9324995Z 2025-12-04T10:49:10.9325158Z [W1204 10:03:58.168901696 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9325358Z 2025-12-04T10:49:10.9325524Z [W1204 10:03:58.168961465 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9325723Z 2025-12-04T10:49:10.9325893Z [W1204 10:03:58.169051483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9326087Z 2025-12-04T10:49:10.9326249Z [W1204 10:03:58.169112701 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9326489Z 2025-12-04T10:49:10.9326539Z FAILED [0.5525s] [100%] 2025-12-04T10:49:10.9326618Z 2025-12-04T10:49:10.9326682Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9326960Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9327222Z Traceback (most recent call last): 2025-12-04T10:49:10.9327479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9327731Z method(*args, **kwargs) 2025-12-04T10:49:10.9327975Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9328221Z method(*args, **kwargs) 2025-12-04T10:49:10.9328458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9328710Z with policy(): 2025-12-04T10:49:10.9328943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9329190Z raise RuntimeError(msg) 2025-12-04T10:49:10.9329688Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9330140Z 2025-12-04T10:49:10.9330218Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9330647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9330993Z 2025-12-04T10:49:10.9331087Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9331314Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9331505Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9331822Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9332182Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9332349Z graph_break [] 2025-12-04T10:49:10.9332502Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9332987Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9333424Z if out == self.unknown_value: 2025-12-04T10:49:10.9333676Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9333933Z Traceback (most recent call last): 2025-12-04T10:49:10.9334184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9334434Z method(*args, **kwargs) 2025-12-04T10:49:10.9334678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9334925Z method(*args, **kwargs) 2025-12-04T10:49:10.9335158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9335400Z with policy(): 2025-12-04T10:49:10.9335628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9335915Z raise RuntimeError(msg) 2025-12-04T10:49:10.9336409Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9336858Z 2025-12-04T10:49:10.9336940Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9337360Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9337698Z 2025-12-04T10:49:10.9337799Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9338019Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9338211Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9338498Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9338805Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9338969Z graph_break [] 2025-12-04T10:49:10.9339113Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9339610Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9340040Z if out == self.unknown_value: 2025-12-04T10:49:10.9340201Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9340378Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9340550Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9340883Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9341146Z graph_break [] 2025-12-04T10:49:10.9341264Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9341525Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9341775Z Traceback (most recent call last): 2025-12-04T10:49:10.9342060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9342301Z method(*args, **kwargs) 2025-12-04T10:49:10.9342533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9342769Z method(*args, **kwargs) 2025-12-04T10:49:10.9342994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9343229Z with policy(): 2025-12-04T10:49:10.9343457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9343695Z raise RuntimeError(msg) 2025-12-04T10:49:10.9344184Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9344669Z 2025-12-04T10:49:10.9344745Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9345156Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9345493Z 2025-12-04T10:49:10.9345584Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9345789Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9345976Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9346253Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9346547Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9346711Z graph_break [] 2025-12-04T10:49:10.9346853Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9347316Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9347751Z if out == self.unknown_value: 2025-12-04T10:49:10.9347904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9348086Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9348262Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9348559Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9348820Z graph_break [] 2025-12-04T10:49:10.9348958Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9349137Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9349310Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9349639Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9349899Z graph_break [] 2025-12-04T10:49:10.9350210Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-eccf8a19b6130b9e.xml - 2025-12-04T10:49:10.9350562Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9351316Z FAILED [0.5525s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9352060Z 2025-12-04T10:49:10.9352144Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9352566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9352898Z 2025-12-04T10:49:10.9352995Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9353191Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9353406Z ================== 1 failed, 57 deselected, 2 rerun in 11.61s ================== 2025-12-04T10:49:10.9353562Z Got exit code 1 2025-12-04T10:49:10.9353671Z Retrying single test... 2025-12-04T10:49:10.9353947Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4f6c9fe929938079.xml 2025-12-04T10:49:10.9354259Z ============================= test session starts ============================== 2025-12-04T10:49:10.9354482Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9354686Z cachedir: .pytest_cache 2025-12-04T10:49:10.9354914Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9355160Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9355289Z configfile: pytest.ini 2025-12-04T10:49:10.9355528Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9355821Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9356313Z stepcurrent: skipping 0 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9356694Z Running 1 items in this shard 2025-12-04T10:49:10.9356770Z 2025-12-04T10:49:10.9357153Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:07.009127063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9357556Z 2025-12-04T10:49:10.9357717Z [W1204 10:04:14.520587738 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9357917Z 2025-12-04T10:49:10.9358073Z [W1204 10:04:14.520769024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9358266Z 2025-12-04T10:49:10.9358433Z [W1204 10:04:14.521282052 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9358658Z 2025-12-04T10:49:10.9358812Z [W1204 10:04:14.521410569 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9359005Z 2025-12-04T10:49:10.9359165Z [W1204 10:04:14.522643950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9359354Z 2025-12-04T10:49:10.9359511Z [W1204 10:04:14.522714499 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9359711Z 2025-12-04T10:49:10.9359873Z [W1204 10:04:14.522830596 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9360065Z 2025-12-04T10:49:10.9360221Z [W1204 10:04:14.522895764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9360413Z 2025-12-04T10:49:10.9360571Z [W1204 10:04:15.527410110 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9360764Z 2025-12-04T10:49:10.9360920Z [W1204 10:04:15.527508308 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9361115Z 2025-12-04T10:49:10.9361270Z [W1204 10:04:15.527581966 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9361496Z 2025-12-04T10:49:10.9361655Z [W1204 10:04:15.527678564 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9361880Z 2025-12-04T10:49:10.9362035Z [W1204 10:04:15.527740602 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9362232Z 2025-12-04T10:49:10.9362391Z [W1204 10:04:15.527851430 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9362590Z 2025-12-04T10:49:10.9362744Z [W1204 10:04:15.527912608 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9362937Z 2025-12-04T10:49:10.9363094Z [W1204 10:04:15.527999136 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9363295Z 2025-12-04T10:49:10.9363458Z [W1204 10:04:15.528064665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9363648Z 2025-12-04T10:49:10.9363807Z [W1204 10:04:15.568894431 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9364000Z 2025-12-04T10:49:10.9364161Z [W1204 10:04:15.569018598 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9364353Z 2025-12-04T10:49:10.9364514Z [W1204 10:04:15.569094846 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9364710Z 2025-12-04T10:49:10.9364870Z [W1204 10:04:15.569201023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9365064Z 2025-12-04T10:49:10.9365224Z [W1204 10:04:15.569264712 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9365417Z 2025-12-04T10:49:10.9365570Z [W1204 10:04:15.569378299 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9365763Z 2025-12-04T10:49:10.9365940Z [W1204 10:04:15.569441788 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9366136Z 2025-12-04T10:49:10.9366294Z [W1204 10:04:15.569530616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9366486Z 2025-12-04T10:49:10.9366640Z [W1204 10:04:15.569591094 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9366841Z 2025-12-04T10:49:10.9366900Z ('RERUN', {'yellow': True}) [10.1896s] [100%] 2025-12-04T10:49:10.9367390Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:16.714730521 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9367796Z 2025-12-04T10:49:10.9367960Z [W1204 10:04:16.714923576 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9368154Z 2025-12-04T10:49:10.9368310Z [W1204 10:04:16.715007345 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9368499Z 2025-12-04T10:49:10.9368658Z [W1204 10:04:16.715122012 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9368849Z 2025-12-04T10:49:10.9369016Z [W1204 10:04:16.715187350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9369252Z 2025-12-04T10:49:10.9369406Z [W1204 10:04:16.715290748 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9369598Z 2025-12-04T10:49:10.9369754Z [W1204 10:04:16.715352067 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9369950Z 2025-12-04T10:49:10.9370102Z [W1204 10:04:16.715437795 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9370305Z 2025-12-04T10:49:10.9370458Z [W1204 10:04:16.715497123 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9370652Z 2025-12-04T10:49:10.9370804Z [W1204 10:04:16.718192061 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9371001Z 2025-12-04T10:49:10.9371158Z [W1204 10:04:16.718298028 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9371351Z 2025-12-04T10:49:10.9371509Z [W1204 10:04:16.718370837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9371697Z 2025-12-04T10:49:10.9371898Z [W1204 10:04:16.718468385 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9372086Z 2025-12-04T10:49:10.9372245Z [W1204 10:04:16.718529373 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9372437Z 2025-12-04T10:49:10.9372595Z [W1204 10:04:16.718625991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9372787Z 2025-12-04T10:49:10.9372943Z [W1204 10:04:16.718699509 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9373134Z 2025-12-04T10:49:10.9373293Z [W1204 10:04:16.718785837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9373515Z 2025-12-04T10:49:10.9373669Z [W1204 10:04:16.718844706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9373862Z 2025-12-04T10:49:10.9374014Z [W1204 10:04:16.757231148 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9374215Z 2025-12-04T10:49:10.9374367Z [W1204 10:04:16.757339556 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9374561Z 2025-12-04T10:49:10.9374712Z [W1204 10:04:16.757413884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9374904Z 2025-12-04T10:49:10.9375059Z [W1204 10:04:16.757514212 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9375253Z 2025-12-04T10:49:10.9375476Z [W1204 10:04:16.757574470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9375669Z 2025-12-04T10:49:10.9375828Z [W1204 10:04:16.757671938 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9376021Z 2025-12-04T10:49:10.9376179Z [W1204 10:04:16.757732237 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9376401Z 2025-12-04T10:49:10.9376559Z [W1204 10:04:16.757819715 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9376754Z 2025-12-04T10:49:10.9376916Z [W1204 10:04:16.757877234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9377105Z 2025-12-04T10:49:10.9377167Z ('RERUN', {'yellow': True}) [0.6997s] [100%] 2025-12-04T10:49:10.9377621Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:16.460093889 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9378032Z 2025-12-04T10:49:10.9378192Z [W1204 10:04:16.460274305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9378396Z 2025-12-04T10:49:10.9378553Z [W1204 10:04:16.460351893 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9378745Z 2025-12-04T10:49:10.9378899Z [W1204 10:04:16.460458321 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9379091Z 2025-12-04T10:49:10.9379246Z [W1204 10:04:16.460523019 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9379445Z 2025-12-04T10:49:10.9379603Z [W1204 10:04:16.460645346 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9379794Z 2025-12-04T10:49:10.9379953Z [W1204 10:04:16.460711835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9380151Z 2025-12-04T10:49:10.9380309Z [W1204 10:04:16.460799603 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9380498Z 2025-12-04T10:49:10.9380654Z [W1204 10:04:16.460859412 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9380845Z 2025-12-04T10:49:10.9381037Z [W1204 10:04:16.463479271 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9381228Z 2025-12-04T10:49:10.9381384Z [W1204 10:04:16.463586338 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9381579Z 2025-12-04T10:49:10.9381732Z [W1204 10:04:16.463658357 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9381982Z 2025-12-04T10:49:10.9382142Z [W1204 10:04:16.463755665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9382336Z 2025-12-04T10:49:10.9382489Z [W1204 10:04:16.463817133 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9382688Z 2025-12-04T10:49:10.9382847Z [W1204 10:04:16.463912941 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9383040Z 2025-12-04T10:49:10.9383195Z [W1204 10:04:16.463984039 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9383394Z 2025-12-04T10:49:10.9383546Z [W1204 10:04:16.464075297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9383739Z 2025-12-04T10:49:10.9383938Z [W1204 10:04:16.464135856 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9384133Z 2025-12-04T10:49:10.9384292Z [W1204 10:04:16.502936219 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9384479Z 2025-12-04T10:49:10.9384635Z [W1204 10:04:16.503047366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9384822Z 2025-12-04T10:49:10.9384979Z [W1204 10:04:16.503121754 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9385167Z 2025-12-04T10:49:10.9385322Z [W1204 10:04:16.503224132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9385508Z 2025-12-04T10:49:10.9385665Z [W1204 10:04:16.503286571 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9385858Z 2025-12-04T10:49:10.9386010Z [W1204 10:04:16.503385238 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9386201Z 2025-12-04T10:49:10.9386355Z [W1204 10:04:16.503446037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9386546Z 2025-12-04T10:49:10.9386697Z [W1204 10:04:16.503532465 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9386889Z 2025-12-04T10:49:10.9387039Z [W1204 10:04:16.503591254 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9387228Z 2025-12-04T10:49:10.9387269Z FAILED [0.7612s] [100%] 2025-12-04T10:49:10.9387340Z 2025-12-04T10:49:10.9387398Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9387655Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9387899Z Traceback (most recent call last): 2025-12-04T10:49:10.9388173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9388411Z method(*args, **kwargs) 2025-12-04T10:49:10.9388638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9388874Z method(*args, **kwargs) 2025-12-04T10:49:10.9389097Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9389328Z with policy(): 2025-12-04T10:49:10.9389544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9389780Z raise RuntimeError(msg) 2025-12-04T10:49:10.9390260Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:10.9390700Z 2025-12-04T10:49:10.9390779Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9391193Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9391639Z 2025-12-04T10:49:10.9391732Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9391997Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9392174Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9392451Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9392746Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9392897Z graph_break [] 2025-12-04T10:49:10.9393031Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9393491Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9393923Z if out == self.unknown_value: 2025-12-04T10:49:10.9394167Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9394418Z Traceback (most recent call last): 2025-12-04T10:49:10.9394654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9394891Z method(*args, **kwargs) 2025-12-04T10:49:10.9395117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9395356Z method(*args, **kwargs) 2025-12-04T10:49:10.9395581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9395812Z with policy(): 2025-12-04T10:49:10.9396029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9396267Z raise RuntimeError(msg) 2025-12-04T10:49:10.9396757Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9397203Z 2025-12-04T10:49:10.9397314Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9397724Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9398059Z 2025-12-04T10:49:10.9398150Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9398354Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9398534Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9398811Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9399106Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9399259Z graph_break [] 2025-12-04T10:49:10.9399396Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9399857Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9400284Z if out == self.unknown_value: 2025-12-04T10:49:10.9400436Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9400613Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9400824Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9401116Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9401373Z graph_break [] 2025-12-04T10:49:10.9401490Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9401747Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9402033Z Traceback (most recent call last): 2025-12-04T10:49:10.9402272Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9402511Z method(*args, **kwargs) 2025-12-04T10:49:10.9402736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9402979Z method(*args, **kwargs) 2025-12-04T10:49:10.9403208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9403440Z with policy(): 2025-12-04T10:49:10.9403658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9403900Z raise RuntimeError(msg) 2025-12-04T10:49:10.9404384Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9404833Z 2025-12-04T10:49:10.9404913Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9405330Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9405662Z 2025-12-04T10:49:10.9405757Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9405993Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9406171Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9406447Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9406737Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9406885Z graph_break [] 2025-12-04T10:49:10.9407018Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9407474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9407900Z if out == self.unknown_value: 2025-12-04T10:49:10.9408058Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9408232Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9408399Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9408692Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9408948Z graph_break [] 2025-12-04T10:49:10.9409082Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9409289Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9409459Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9409747Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9410002Z graph_break [] 2025-12-04T10:49:10.9410306Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4f6c9fe929938079.xml - 2025-12-04T10:49:10.9410649Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9411396Z FAILED [0.7612s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9412135Z 2025-12-04T10:49:10.9412210Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9412624Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9412959Z 2025-12-04T10:49:10.9413051Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9413245Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9413421Z ================== 1 failed, 57 deselected, 2 rerun in 11.82s ================== 2025-12-04T10:49:10.9413573Z Got exit code 1 2025-12-04T10:49:10.9413885Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9414300Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9414701Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d25e1da6cba61e6b.xml 2025-12-04T10:49:10.9414999Z ============================= test session starts ============================== 2025-12-04T10:49:10.9415217Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9415416Z cachedir: .pytest_cache 2025-12-04T10:49:10.9415646Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9415894Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9416019Z configfile: pytest.ini 2025-12-04T10:49:10.9416255Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9416538Z collecting ... collected 58 items / 1 deselected / 57 selected 2025-12-04T10:49:10.9416706Z stepcurrent: skipping 1 already run items. 2025-12-04T10:49:10.9416847Z Running 57 items in this shard 2025-12-04T10:49:10.9416921Z 2025-12-04T10:49:10.9417188Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6667s] [ 1%] 2025-12-04T10:49:10.9417741Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7192s] [ 1%] 2025-12-04T10:49:10.9418301Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.7078s] [ 1%] 2025-12-04T10:49:10.9418572Z 2025-12-04T10:49:10.9418634Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9418898Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9419151Z Traceback (most recent call last): 2025-12-04T10:49:10.9419400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9419646Z method(*args, **kwargs) 2025-12-04T10:49:10.9419879Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9420123Z method(*args, **kwargs) 2025-12-04T10:49:10.9420353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9420595Z with policy(): 
2025-12-04T10:49:10.9420817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9421061Z raise RuntimeError(msg) 2025-12-04T10:49:10.9421550Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9422040Z 2025-12-04T10:49:10.9422117Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9422532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9422871Z 2025-12-04T10:49:10.9422962Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9423169Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9423353Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9423670Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9423966Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9424126Z graph_break [] 2025-12-04T10:49:10.9424350Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9424599Z Traceback (most recent call last): 2025-12-04T10:49:10.9424846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T10:49:10.9425087Z method(*args, **kwargs) 2025-12-04T10:49:10.9425318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9425559Z method(*args, **kwargs) 2025-12-04T10:49:10.9425790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9426025Z with policy(): 2025-12-04T10:49:10.9426248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9426491Z raise RuntimeError(msg) 2025-12-04T10:49:10.9426983Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9427572Z 2025-12-04T10:49:10.9427650Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9428072Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9428408Z 2025-12-04T10:49:10.9428504Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9428710Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9428892Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9429173Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9429473Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9429630Z graph_break [] 2025-12-04T10:49:10.9429769Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9429952Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9430126Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9430421Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9430681Z graph_break [] 2025-12-04T10:49:10.9430799Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9431060Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9431317Z Traceback (most recent call last): 2025-12-04T10:49:10.9431561Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9431804Z method(*args, **kwargs) 2025-12-04T10:49:10.9432087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9432327Z method(*args, **kwargs) 2025-12-04T10:49:10.9432595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9432832Z with policy(): 2025-12-04T10:49:10.9433056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9433301Z raise RuntimeError(msg) 2025-12-04T10:49:10.9433793Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9434242Z 2025-12-04T10:49:10.9434328Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9434750Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9435088Z 2025-12-04T10:49:10.9435178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9435385Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9435569Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9435849Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9436179Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9436338Z graph_break [] 2025-12-04T10:49:10.9436478Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9436657Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9436835Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9437134Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9437395Z graph_break [] 2025-12-04T10:49:10.9437534Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9437714Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9437887Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9438188Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), 
('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9438449Z graph_break [] 2025-12-04T10:49:10.9438762Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d25e1da6cba61e6b.xml - 2025-12-04T10:49:10.9439113Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9439867Z FAILED [0.7078s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9440547Z 2025-12-04T10:49:10.9440624Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9441068Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9441400Z 2025-12-04T10:49:10.9441497Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9441693Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9441921Z =================== 1 failed, 1 deselected, 2 rerun in 4.26s =================== 2025-12-04T10:49:10.9442074Z Got exit code 1 2025-12-04T10:49:10.9442178Z Retrying single test... 
2025-12-04T10:49:10.9442447Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6b6334b69131f720.xml 2025-12-04T10:49:10.9442740Z ============================= test session starts ============================== 2025-12-04T10:49:10.9442956Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9443146Z cachedir: .pytest_cache 2025-12-04T10:49:10.9443371Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9443608Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9443729Z configfile: pytest.ini 2025-12-04T10:49:10.9443955Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9444229Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9444629Z stepcurrent: skipping 1 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9445028Z Running 1 items in this shard 2025-12-04T10:49:10.9445099Z 2025-12-04T10:49:10.9445477Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:37.051073318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9445879Z 2025-12-04T10:49:10.9446034Z [W1204 10:04:45.725937338 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9446225Z 2025-12-04T10:49:10.9446377Z [W1204 10:04:45.726072495 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9446570Z 2025-12-04T10:49:10.9446722Z [W1204 10:04:45.726492265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9446913Z 2025-12-04T10:49:10.9447064Z [W1204 10:04:45.726580613 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9447254Z 2025-12-04T10:49:10.9447408Z [W1204 10:04:45.727824555 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9447596Z 2025-12-04T10:49:10.9447745Z [W1204 10:04:45.727888273 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9447935Z 2025-12-04T10:49:10.9448084Z [W1204 10:04:45.727986661 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9448277Z 2025-12-04T10:49:10.9448433Z [W1204 10:04:45.728048240 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9448619Z 2025-12-04T10:49:10.9448771Z [W1204 10:04:45.732239484 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9448957Z 2025-12-04T10:49:10.9449134Z [W1204 10:04:45.732338762 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9449320Z 2025-12-04T10:49:10.9449471Z [W1204 10:04:45.732413840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9449658Z 2025-12-04T10:49:10.9449810Z [W1204 10:04:45.732512668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9449998Z 2025-12-04T10:49:10.9450154Z [W1204 10:04:45.732576106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9450343Z 2025-12-04T10:49:10.9450491Z [W1204 10:04:45.732670874 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9450679Z 2025-12-04T10:49:10.9450832Z [W1204 10:04:45.732731333 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9451019Z 2025-12-04T10:49:10.9451169Z [W1204 10:04:45.732821231 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9451356Z 2025-12-04T10:49:10.9451504Z [W1204 10:04:45.732879800 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9451691Z 2025-12-04T10:49:10.9451840Z [W1204 10:04:45.776199690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9452102Z 2025-12-04T10:49:10.9452252Z [W1204 10:04:45.776313737 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9452437Z 2025-12-04T10:49:10.9452590Z [W1204 10:04:45.776387266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9452777Z 2025-12-04T10:49:10.9452930Z [W1204 10:04:45.776488193 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9453114Z 2025-12-04T10:49:10.9453266Z [W1204 10:04:45.776548152 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9453451Z 2025-12-04T10:49:10.9453601Z [W1204 10:04:45.776644740 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9453790Z 2025-12-04T10:49:10.9453942Z [W1204 10:04:45.776703649 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9454130Z 2025-12-04T10:49:10.9454281Z [W1204 10:04:45.776788817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9454468Z 2025-12-04T10:49:10.9454618Z [W1204 10:04:45.776844995 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9454806Z 2025-12-04T10:49:10.9454902Z ('RERUN', {'yellow': True}) [10.2809s] [100%] 2025-12-04T10:49:10.9455393Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:46.794349375 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9455818Z 2025-12-04T10:49:10.9455982Z [W1204 10:04:46.794584390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9456285Z 2025-12-04T10:49:10.9456479Z [W1204 10:04:46.794681078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9456691Z 2025-12-04T10:49:10.9456863Z [W1204 10:04:46.794840764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9457054Z 2025-12-04T10:49:10.9457252Z [W1204 10:04:46.794939352 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9457452Z 2025-12-04T10:49:10.9457624Z [W1204 10:04:46.795076319 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9457821Z 2025-12-04T10:49:10.9457988Z [W1204 10:04:46.795171766 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9458199Z 2025-12-04T10:49:10.9458562Z [W1204 10:04:46.795279904 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9458771Z 2025-12-04T10:49:10.9458932Z [W1204 10:04:46.795345822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9459137Z 2025-12-04T10:49:10.9459311Z [W1204 10:04:46.799534227 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9459525Z 2025-12-04T10:49:10.9459685Z [W1204 10:04:46.799766322 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9459920Z 2025-12-04T10:49:10.9460085Z [W1204 10:04:46.799861009 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9460299Z 2025-12-04T10:49:10.9460479Z [W1204 10:04:46.800014926 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9460679Z 2025-12-04T10:49:10.9460858Z [W1204 10:04:46.800122053 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9461054Z 2025-12-04T10:49:10.9461233Z [W1204 10:04:46.800241721 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9461432Z 2025-12-04T10:49:10.9461605Z [W1204 10:04:46.800308129 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9461809Z 2025-12-04T10:49:10.9462017Z [W1204 10:04:46.800419647 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9462209Z 2025-12-04T10:49:10.9462396Z [W1204 10:04:46.800481835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9462606Z 2025-12-04T10:49:10.9462770Z [W1204 10:04:46.841909469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9462980Z 2025-12-04T10:49:10.9463137Z [W1204 10:04:46.842064686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9463359Z 2025-12-04T10:49:10.9463517Z [W1204 10:04:46.842139244 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9463743Z 2025-12-04T10:49:10.9463902Z [W1204 10:04:46.842251451 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9464110Z 2025-12-04T10:49:10.9464298Z [W1204 10:04:46.842311520 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9464493Z 2025-12-04T10:49:10.9464707Z [W1204 10:04:46.842412848 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9464907Z 2025-12-04T10:49:10.9465074Z [W1204 10:04:46.842472276 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9465282Z 2025-12-04T10:49:10.9465464Z [W1204 10:04:46.842557834 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9465664Z 2025-12-04T10:49:10.9466232Z [W1204 10:04:46.842614793 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9466428Z 2025-12-04T10:49:10.9466511Z ('RERUN', {'yellow': True}) [0.5147s] [100%] 2025-12-04T10:49:10.9467004Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:46.319665540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9467423Z 2025-12-04T10:49:10.9467585Z [W1204 10:04:46.319841606 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9467800Z 2025-12-04T10:49:10.9467968Z [W1204 10:04:46.319917175 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9468206Z 2025-12-04T10:49:10.9468366Z [W1204 10:04:46.320028632 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9468576Z 2025-12-04T10:49:10.9468759Z [W1204 10:04:46.320096681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9468962Z 2025-12-04T10:49:10.9469138Z [W1204 10:04:46.320198248 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9469334Z 2025-12-04T10:49:10.9469509Z [W1204 10:04:46.320257517 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9469704Z 2025-12-04T10:49:10.9469890Z [W1204 10:04:46.320343245 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9470085Z 2025-12-04T10:49:10.9470263Z [W1204 10:04:46.320401824 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9470461Z 2025-12-04T10:49:10.9470633Z [W1204 10:04:46.322945106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9470858Z 2025-12-04T10:49:10.9471018Z [W1204 10:04:46.323047573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9471226Z 2025-12-04T10:49:10.9471385Z [W1204 10:04:46.323118292 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9471597Z 2025-12-04T10:49:10.9471770Z [W1204 10:04:46.323211510 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9472030Z 2025-12-04T10:49:10.9472190Z [W1204 10:04:46.323270938 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9472408Z 2025-12-04T10:49:10.9472570Z [W1204 10:04:46.323365046 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9472796Z 2025-12-04T10:49:10.9473007Z [W1204 10:04:46.323423135 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9473205Z 2025-12-04T10:49:10.9473385Z [W1204 10:04:46.323507703 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9473580Z 2025-12-04T10:49:10.9473759Z [W1204 10:04:46.323569591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9473961Z 2025-12-04T10:49:10.9474138Z [W1204 10:04:46.361704201 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9474339Z 2025-12-04T10:49:10.9474514Z [W1204 10:04:46.361808458 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9474727Z 2025-12-04T10:49:10.9474895Z [W1204 10:04:46.361881487 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9475107Z 2025-12-04T10:49:10.9475267Z [W1204 10:04:46.361980324 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9475475Z 2025-12-04T10:49:10.9475634Z [W1204 10:04:46.362046013 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9475858Z 2025-12-04T10:49:10.9476022Z [W1204 10:04:46.362145381 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9476260Z 2025-12-04T10:49:10.9476420Z [W1204 10:04:46.362204479 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9476626Z 2025-12-04T10:49:10.9476818Z [W1204 10:04:46.362289317 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9477023Z 2025-12-04T10:49:10.9477195Z [W1204 10:04:46.362346646 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9477389Z 2025-12-04T10:49:10.9477445Z FAILED [0.5114s] [100%] 2025-12-04T10:49:10.9477529Z 2025-12-04T10:49:10.9477616Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9477896Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9478184Z Traceback (most recent call last): 2025-12-04T10:49:10.9478461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9478732Z method(*args, **kwargs) 2025-12-04T10:49:10.9479004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9479267Z method(*args, **kwargs) 2025-12-04T10:49:10.9479516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9479792Z with policy(): 2025-12-04T10:49:10.9480035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9480309Z raise RuntimeError(msg) 2025-12-04T10:49:10.9481595Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9482086Z 2025-12-04T10:49:10.9482176Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9482687Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9483039Z 2025-12-04T10:49:10.9483140Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9483365Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9483586Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9483893Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9484227Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9484416Z graph_break [] 2025-12-04T10:49:10.9484576Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9485077Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9485546Z if out == self.unknown_value: 2025-12-04T10:49:10.9485813Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9486093Z Traceback (most recent call last): 2025-12-04T10:49:10.9486358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9486673Z method(*args, **kwargs) 2025-12-04T10:49:10.9486922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9487176Z method(*args, **kwargs) 2025-12-04T10:49:10.9487454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9487714Z with policy(): 2025-12-04T10:49:10.9487969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9488233Z raise RuntimeError(msg) 2025-12-04T10:49:10.9488743Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9489196Z 2025-12-04T10:49:10.9489316Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9489751Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9490102Z 2025-12-04T10:49:10.9490198Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9490450Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9490653Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9490964Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9491287Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9491470Z graph_break [] 2025-12-04T10:49:10.9491639Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9492213Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9492658Z if out == self.unknown_value: 2025-12-04T10:49:10.9492851Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9493058Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9493261Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9493587Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9493876Z graph_break [] 2025-12-04T10:49:10.9494025Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9494310Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9494574Z Traceback (most recent call last): 2025-12-04T10:49:10.9494861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9495123Z method(*args, **kwargs) 2025-12-04T10:49:10.9495378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9495653Z method(*args, **kwargs) 2025-12-04T10:49:10.9495904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9496207Z with policy(): 2025-12-04T10:49:10.9496451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9496740Z raise RuntimeError(msg) 2025-12-04T10:49:10.9497255Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9497705Z 2025-12-04T10:49:10.9497802Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9498257Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9498598Z 2025-12-04T10:49:10.9498709Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9498935Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9499156Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9499457Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9499797Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9499977Z graph_break [] 2025-12-04T10:49:10.9500142Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9549248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9549708Z if out == self.unknown_value: 2025-12-04T10:49:10.9549877Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9550059Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9550238Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9550622Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9550883Z graph_break [] 2025-12-04T10:49:10.9551019Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9551198Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9551367Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9551654Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9551946Z graph_break [] 2025-12-04T10:49:10.9552257Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6b6334b69131f720.xml - 2025-12-04T10:49:10.9552600Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9553513Z FAILED [0.5114s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9554237Z 2025-12-04T10:49:10.9554321Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9554735Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9555066Z 2025-12-04T10:49:10.9555167Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9555364Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9555536Z ================== 1 failed, 57 deselected, 2 rerun in 11.47s ================== 2025-12-04T10:49:10.9555681Z Got exit code 1 2025-12-04T10:49:10.9555782Z Retrying single test... 2025-12-04T10:49:10.9556056Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2dc4080e561c85f8.xml 2025-12-04T10:49:10.9556360Z ============================= test session starts ============================== 2025-12-04T10:49:10.9556580Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9556777Z cachedir: .pytest_cache 2025-12-04T10:49:10.9557005Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9557256Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9557380Z configfile: pytest.ini 2025-12-04T10:49:10.9557624Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9557902Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9558310Z stepcurrent: skipping 1 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9558694Z Running 1 items in this shard 2025-12-04T10:49:10.9558770Z 2025-12-04T10:49:10.9559180Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:04:55.495853008 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9559581Z 2025-12-04T10:49:10.9559742Z [W1204 10:05:03.841699007 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9559938Z 2025-12-04T10:49:10.9560089Z [W1204 10:05:03.841863744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9560283Z 2025-12-04T10:49:10.9560432Z [W1204 10:05:03.842305224 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9560628Z 2025-12-04T10:49:10.9560781Z [W1204 10:05:03.842408531 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9560975Z 2025-12-04T10:49:10.9561124Z [W1204 10:05:03.843119055 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9561321Z 2025-12-04T10:49:10.9561471Z [W1204 10:05:03.843191084 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9561661Z 2025-12-04T10:49:10.9561813Z [W1204 10:05:03.843311591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9562046Z 2025-12-04T10:49:10.9562201Z [W1204 10:05:03.843372049 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9562418Z 2025-12-04T10:49:10.9562571Z [W1204 10:05:03.848284788 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9562760Z 2025-12-04T10:49:10.9562912Z [W1204 10:05:03.848389806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9563097Z 2025-12-04T10:49:10.9563257Z [W1204 10:05:03.848462154 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9563443Z 2025-12-04T10:49:10.9563596Z [W1204 10:05:03.848559672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9563786Z 2025-12-04T10:49:10.9563937Z [W1204 10:05:03.848620830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9564127Z 2025-12-04T10:49:10.9564276Z [W1204 10:05:03.848716998 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9564469Z 2025-12-04T10:49:10.9564617Z [W1204 10:05:03.848778097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9564811Z 2025-12-04T10:49:10.9564961Z [W1204 10:05:03.848865045 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9565151Z 2025-12-04T10:49:10.9565303Z [W1204 10:05:03.848923573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9565497Z 2025-12-04T10:49:10.9565646Z [W1204 10:05:03.890338804 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9565837Z 2025-12-04T10:49:10.9565988Z [W1204 10:05:03.890460892 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9566176Z 2025-12-04T10:49:10.9566326Z [W1204 10:05:03.890533920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9566512Z 2025-12-04T10:49:10.9566697Z [W1204 10:05:03.890635368 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9566883Z 2025-12-04T10:49:10.9567042Z [W1204 10:05:03.890695266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9567228Z 2025-12-04T10:49:10.9567383Z [W1204 10:05:03.890792734 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9567572Z 2025-12-04T10:49:10.9567728Z [W1204 10:05:03.890851823 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9567918Z 2025-12-04T10:49:10.9568068Z [W1204 10:05:03.890939761 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9568261Z 2025-12-04T10:49:10.9568412Z [W1204 10:05:03.890998029 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9568603Z 2025-12-04T10:49:10.9568659Z ('RERUN', {'yellow': True}) [10.0844s] [100%] 2025-12-04T10:49:10.9569111Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:04.117095118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9569513Z 2025-12-04T10:49:10.9569691Z [W1204 10:05:04.117279244 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9569879Z 2025-12-04T10:49:10.9570032Z [W1204 10:05:04.117364482 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9570218Z 2025-12-04T10:49:10.9570372Z [W1204 10:05:04.117474509 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9570738Z 2025-12-04T10:49:10.9570891Z [W1204 10:05:04.117538878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9571077Z 2025-12-04T10:49:10.9571229Z [W1204 10:05:04.117641476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9571415Z 2025-12-04T10:49:10.9571568Z [W1204 10:05:04.117702634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9571762Z 2025-12-04T10:49:10.9571956Z [W1204 10:05:04.117791432 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9572147Z 2025-12-04T10:49:10.9572297Z [W1204 10:05:04.117852361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9572487Z 2025-12-04T10:49:10.9572637Z [W1204 10:05:04.120526830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9572827Z 2025-12-04T10:49:10.9572979Z [W1204 10:05:04.120628988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9573168Z 2025-12-04T10:49:10.9573317Z [W1204 10:05:04.120701586 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9573509Z 2025-12-04T10:49:10.9573659Z [W1204 10:05:04.120799154 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9573844Z 2025-12-04T10:49:10.9574021Z [W1204 10:05:04.120861043 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9574207Z 2025-12-04T10:49:10.9574358Z [W1204 10:05:04.120957150 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9574543Z 2025-12-04T10:49:10.9574694Z [W1204 10:05:04.121021839 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9574879Z 2025-12-04T10:49:10.9575028Z [W1204 10:05:04.121108417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9575215Z 2025-12-04T10:49:10.9575364Z [W1204 10:05:04.121167246 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9575551Z 2025-12-04T10:49:10.9575701Z [W1204 10:05:04.159495097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9575889Z 2025-12-04T10:49:10.9576037Z [W1204 10:05:04.159602024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9576223Z 2025-12-04T10:49:10.9576370Z [W1204 10:05:04.159674893 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9576558Z 2025-12-04T10:49:10.9576706Z [W1204 10:05:04.159774820 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9576928Z 2025-12-04T10:49:10.9577076Z [W1204 10:05:04.159835859 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9577264Z 2025-12-04T10:49:10.9577413Z [W1204 10:05:04.159934477 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9577604Z 2025-12-04T10:49:10.9577758Z [W1204 10:05:04.159994155 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9577943Z 2025-12-04T10:49:10.9578096Z [W1204 10:05:04.160085003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9578282Z 2025-12-04T10:49:10.9578434Z [W1204 10:05:04.160143712 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9578623Z 2025-12-04T10:49:10.9578679Z ('RERUN', {'yellow': True}) [0.7039s] [100%] 2025-12-04T10:49:10.9579126Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:05.830155175 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9579529Z 2025-12-04T10:49:10.9579679Z [W1204 10:05:05.830347261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9579868Z 2025-12-04T10:49:10.9580016Z [W1204 10:05:05.830420559 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9580204Z 2025-12-04T10:49:10.9580352Z [W1204 10:05:05.830541767 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9580542Z 2025-12-04T10:49:10.9580690Z [W1204 10:05:05.830604605 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9580878Z 2025-12-04T10:49:10.9581026Z [W1204 10:05:05.830705463 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9581216Z 2025-12-04T10:49:10.9581396Z [W1204 10:05:05.830765782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9581587Z 2025-12-04T10:49:10.9581738Z [W1204 10:05:05.830851550 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9581961Z 2025-12-04T10:49:10.9582112Z [W1204 10:05:05.830909508 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9582300Z 2025-12-04T10:49:10.9582455Z [W1204 10:05:05.833504000 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9582641Z 2025-12-04T10:49:10.9582796Z [W1204 10:05:05.833602277 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9582982Z 2025-12-04T10:49:10.9583139Z [W1204 10:05:05.833673836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9583325Z 2025-12-04T10:49:10.9583477Z [W1204 10:05:05.833768264 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9583665Z 2025-12-04T10:49:10.9583814Z [W1204 10:05:05.833827922 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9584064Z 2025-12-04T10:49:10.9584213Z [W1204 10:05:05.833922050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9584401Z 2025-12-04T10:49:10.9584550Z [W1204 10:05:05.833981169 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9584740Z 2025-12-04T10:49:10.9584890Z [W1204 10:05:05.834069997 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9585079Z 2025-12-04T10:49:10.9585230Z [W1204 10:05:05.834129365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9585419Z 2025-12-04T10:49:10.9585570Z [W1204 10:05:05.872161434 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9585755Z 2025-12-04T10:49:10.9585910Z [W1204 10:05:05.872265991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9586095Z 2025-12-04T10:49:10.9586247Z [W1204 10:05:05.872338600 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9586433Z 2025-12-04T10:49:10.9586585Z [W1204 10:05:05.872438397 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9586770Z 2025-12-04T10:49:10.9586921Z [W1204 10:05:05.872497686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9587106Z 2025-12-04T10:49:10.9587259Z [W1204 10:05:05.872616883 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9587448Z 2025-12-04T10:49:10.9587596Z [W1204 10:05:05.872676932 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9587787Z 2025-12-04T10:49:10.9587936Z [W1204 10:05:05.872761300 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9588127Z 2025-12-04T10:49:10.9588300Z [W1204 10:05:05.872819099 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9588490Z 2025-12-04T10:49:10.9588532Z FAILED [0.6969s] [100%] 2025-12-04T10:49:10.9588598Z 2025-12-04T10:49:10.9588654Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9588911Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9589155Z Traceback (most recent call last): 2025-12-04T10:49:10.9589397Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9589638Z method(*args, **kwargs) 2025-12-04T10:49:10.9589862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9590095Z method(*args, **kwargs) 2025-12-04T10:49:10.9590314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9590544Z with policy(): 2025-12-04T10:49:10.9590759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9590992Z raise RuntimeError(msg) 2025-12-04T10:49:10.9591470Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:10.9591977Z 2025-12-04T10:49:10.9592056Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9592469Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9592801Z 2025-12-04T10:49:10.9592893Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9593096Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9593271Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9593548Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9593842Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9593993Z graph_break [] 2025-12-04T10:49:10.9594124Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9594585Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9595006Z if out == self.unknown_value: 2025-12-04T10:49:10.9595241Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9595485Z Traceback (most recent call last): 2025-12-04T10:49:10.9595718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9595948Z method(*args, **kwargs) 2025-12-04T10:49:10.9596170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9596401Z method(*args, **kwargs) 2025-12-04T10:49:10.9596620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9596848Z with policy(): 2025-12-04T10:49:10.9597089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9597322Z raise RuntimeError(msg) 2025-12-04T10:49:10.9597806Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9598253Z 2025-12-04T10:49:10.9598327Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9598732Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9599063Z 2025-12-04T10:49:10.9599152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9599350Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9599521Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9599794Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9600083Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9600258Z graph_break [] 2025-12-04T10:49:10.9600387Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9600843Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9601263Z if out == self.unknown_value: 2025-12-04T10:49:10.9601411Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9601579Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9601743Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9602071Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9602328Z graph_break [] 2025-12-04T10:49:10.9602439Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9602690Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9602926Z Traceback (most recent call last): 2025-12-04T10:49:10.9603157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9603386Z method(*args, **kwargs) 2025-12-04T10:49:10.9603600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9603823Z method(*args, **kwargs) 2025-12-04T10:49:10.9604037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9604256Z with policy(): 2025-12-04T10:49:10.9604463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9604691Z raise RuntimeError(msg) 2025-12-04T10:49:10.9605197Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9605638Z 2025-12-04T10:49:10.9605713Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9606111Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9606438Z 2025-12-04T10:49:10.9606524Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9606718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9606883Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9607147Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9607431Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9607575Z graph_break [] 2025-12-04T10:49:10.9607698Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9608147Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9608563Z if out == self.unknown_value: 2025-12-04T10:49:10.9608705Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9608902Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9609064Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9609345Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9609595Z graph_break [] 2025-12-04T10:49:10.9609718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9609885Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9610045Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9610328Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9610575Z graph_break [] 2025-12-04T10:49:10.9610877Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2dc4080e561c85f8.xml - 2025-12-04T10:49:10.9611208Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9612256Z FAILED [0.6969s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9612928Z 2025-12-04T10:49:10.9613002Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9613406Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9613735Z 2025-12-04T10:49:10.9613822Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9614050Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9614215Z ================== 1 failed, 57 deselected, 2 rerun in 11.65s ================== 2025-12-04T10:49:10.9614353Z Got exit code 1 2025-12-04T10:49:10.9614652Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9615052Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9615413Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-35a974964a74c46b.xml 2025-12-04T10:49:10.9615697Z ============================= test session starts ============================== 2025-12-04T10:49:10.9615902Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9616089Z cachedir: .pytest_cache 2025-12-04T10:49:10.9616312Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9616547Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9616663Z configfile: pytest.ini 2025-12-04T10:49:10.9616890Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9617160Z collecting ... collected 58 items / 2 deselected / 56 selected 2025-12-04T10:49:10.9617316Z stepcurrent: skipping 2 already run items. 2025-12-04T10:49:10.9617482Z Running 56 items in this shard 2025-12-04T10:49:10.9617553Z 2025-12-04T10:49:10.9617814Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6676s] [ 1%] 2025-12-04T10:49:10.9618358Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5955s] [ 1%] 2025-12-04T10:49:10.9618875Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.5930s] [ 1%] 2025-12-04T10:49:10.9619143Z 2025-12-04T10:49:10.9619195Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9619443Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9619685Z Traceback (most recent call last): 2025-12-04T10:49:10.9619918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9620148Z method(*args, **kwargs) 2025-12-04T10:49:10.9620368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9620596Z method(*args, **kwargs) 2025-12-04T10:49:10.9620811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9621033Z with policy(): 
2025-12-04T10:49:10.9621242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9621468Z raise RuntimeError(msg) 2025-12-04T10:49:10.9621980Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9622419Z 2025-12-04T10:49:10.9622493Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9622936Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9623269Z 2025-12-04T10:49:10.9623356Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9623548Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9623716Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9623987Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9624273Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9624416Z graph_break [] 2025-12-04T10:49:10.9624628Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9624864Z Traceback (most recent call last): 2025-12-04T10:49:10.9625092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T10:49:10.9625318Z method(*args, **kwargs) 2025-12-04T10:49:10.9625532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9625757Z method(*args, **kwargs) 2025-12-04T10:49:10.9626005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9626226Z with policy(): 2025-12-04T10:49:10.9626432Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9626658Z raise RuntimeError(msg) 2025-12-04T10:49:10.9627137Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9627577Z 2025-12-04T10:49:10.9627650Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9628049Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9628384Z 2025-12-04T10:49:10.9628471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9628663Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9628831Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9629096Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9629376Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9629518Z graph_break [] 2025-12-04T10:49:10.9629642Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9629806Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9629968Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9630248Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9630492Z graph_break [] 2025-12-04T10:49:10.9630596Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9630872Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9631107Z Traceback (most recent call last): 2025-12-04T10:49:10.9631335Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9631561Z method(*args, **kwargs) 2025-12-04T10:49:10.9631777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9632039Z method(*args, **kwargs) 2025-12-04T10:49:10.9632252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9632472Z with policy(): 2025-12-04T10:49:10.9632680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9632911Z raise RuntimeError(msg) 2025-12-04T10:49:10.9633386Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9633828Z 2025-12-04T10:49:10.9633900Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9634299Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9634665Z 2025-12-04T10:49:10.9634750Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9634942Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9635003Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9635178Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9635250Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9635286Z graph_break [] 2025-12-04T10:49:10.9635359Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9635416Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9635486Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9635660Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9635696Z graph_break [] 2025-12-04T10:49:10.9635772Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9635826Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9635896Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9636069Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), 
('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9636106Z graph_break [] 2025-12-04T10:49:10.9636344Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-35a974964a74c46b.xml - 2025-12-04T10:49:10.9636405Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9637072Z FAILED [0.5930s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9637074Z 2025-12-04T10:49:10.9637148Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9639109Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9639115Z 2025-12-04T10:49:10.9639199Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9639262Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9639332Z =================== 1 failed, 2 deselected, 2 rerun in 4.00s =================== 2025-12-04T10:49:10.9639371Z Got exit code 1 2025-12-04T10:49:10.9639411Z Retrying single test... 
2025-12-04T10:49:10.9639609Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d220605347aad505.xml 2025-12-04T10:49:10.9639665Z ============================= test session starts ============================== 2025-12-04T10:49:10.9639778Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9639856Z cachedir: .pytest_cache 2025-12-04T10:49:10.9640015Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9640060Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9640102Z configfile: pytest.ini 2025-12-04T10:49:10.9640267Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9640341Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9640633Z stepcurrent: skipping 2 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9640678Z Running 1 items in this shard 2025-12-04T10:49:10.9640680Z 2025-12-04T10:49:10.9641053Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:26.014274481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9641058Z 2025-12-04T10:49:10.9641212Z [W1204 10:05:33.487267627 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9641216Z 2025-12-04T10:49:10.9641365Z [W1204 10:05:33.487414574 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9641367Z 2025-12-04T10:49:10.9641514Z [W1204 10:05:33.487856924 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9641517Z 2025-12-04T10:49:10.9641664Z [W1204 10:05:33.487950312 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9641671Z 2025-12-04T10:49:10.9641818Z [W1204 10:05:33.488850142 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9641820Z 2025-12-04T10:49:10.9642000Z [W1204 10:05:33.488917450 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642002Z 2025-12-04T10:49:10.9642178Z [W1204 10:05:33.489031878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642180Z 2025-12-04T10:49:10.9642326Z [W1204 10:05:33.489094616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642328Z 2025-12-04T10:49:10.9642475Z [W1204 10:05:33.493729982 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642479Z 2025-12-04T10:49:10.9642625Z [W1204 10:05:33.493832930 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642627Z 2025-12-04T10:49:10.9642772Z [W1204 10:05:33.493908648 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9642774Z 2025-12-04T10:49:10.9642923Z [W1204 10:05:33.494017176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9642925Z 2025-12-04T10:49:10.9643071Z [W1204 10:05:33.494081524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643073Z 2025-12-04T10:49:10.9643220Z [W1204 10:05:33.494181802 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643222Z 2025-12-04T10:49:10.9643402Z [W1204 10:05:33.494244341 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643404Z 2025-12-04T10:49:10.9643550Z [W1204 10:05:33.494336189 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643551Z 2025-12-04T10:49:10.9643700Z [W1204 10:05:33.494396297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643702Z 2025-12-04T10:49:10.9643848Z [W1204 10:05:34.534930159 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9643850Z 2025-12-04T10:49:10.9643998Z [W1204 10:05:34.535056766 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644000Z 2025-12-04T10:49:10.9644146Z [W1204 10:05:34.535132704 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644151Z 2025-12-04T10:49:10.9644298Z [W1204 10:05:34.535234732 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9644300Z 2025-12-04T10:49:10.9644451Z [W1204 10:05:34.535295731 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644453Z 2025-12-04T10:49:10.9644598Z [W1204 10:05:34.535391948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644600Z 2025-12-04T10:49:10.9644747Z [W1204 10:05:34.535450267 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644749Z 2025-12-04T10:49:10.9644895Z [W1204 10:05:34.535536365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9644899Z 2025-12-04T10:49:10.9645046Z [W1204 10:05:34.535595854 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9645048Z 2025-12-04T10:49:10.9645100Z ('RERUN', {'yellow': True}) [10.2253s] [100%] 2025-12-04T10:49:10.9645492Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:35.752638381 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9645494Z 2025-12-04T10:49:10.9645642Z [W1204 10:05:35.752822657 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9645644Z 2025-12-04T10:49:10.9645791Z [W1204 10:05:35.752897906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9645798Z 2025-12-04T10:49:10.9645943Z [W1204 10:05:35.753015283 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9645944Z 2025-12-04T10:49:10.9646093Z [W1204 10:05:35.753087591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646095Z 2025-12-04T10:49:10.9646240Z [W1204 10:05:35.753188789 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646242Z 2025-12-04T10:49:10.9646388Z [W1204 10:05:35.753249438 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646390Z 2025-12-04T10:49:10.9646535Z [W1204 10:05:35.753337126 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646560Z 2025-12-04T10:49:10.9646708Z [W1204 10:05:35.753396494 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646710Z 2025-12-04T10:49:10.9646858Z [W1204 10:05:35.756063155 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9646860Z 2025-12-04T10:49:10.9647006Z [W1204 10:05:35.756162072 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647007Z 2025-12-04T10:49:10.9647155Z [W1204 10:05:35.756234831 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647157Z 2025-12-04T10:49:10.9647303Z [W1204 10:05:35.756332209 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9647306Z 2025-12-04T10:49:10.9647453Z [W1204 10:05:35.756393187 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647455Z 2025-12-04T10:49:10.9647601Z [W1204 10:05:35.756488725 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647604Z 2025-12-04T10:49:10.9647750Z [W1204 10:05:35.756549564 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647752Z 2025-12-04T10:49:10.9647899Z [W1204 10:05:35.756635942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9647900Z 2025-12-04T10:49:10.9648046Z [W1204 10:05:35.756694651 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648050Z 2025-12-04T10:49:10.9648198Z [W1204 10:05:35.794786657 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648199Z 2025-12-04T10:49:10.9648346Z [W1204 10:05:35.794895015 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648348Z 2025-12-04T10:49:10.9648514Z [W1204 10:05:35.794970773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648516Z 2025-12-04T10:49:10.9648663Z [W1204 10:05:35.795081230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648665Z 2025-12-04T10:49:10.9648812Z [W1204 10:05:35.795144979 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9648815Z 2025-12-04T10:49:10.9648961Z [W1204 10:05:35.795242487 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9648963Z 2025-12-04T10:49:10.9649113Z [W1204 10:05:35.795303156 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9649117Z 2025-12-04T10:49:10.9649267Z [W1204 10:05:35.795388134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9649269Z 2025-12-04T10:49:10.9649417Z [W1204 10:05:35.795446782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9649418Z 2025-12-04T10:49:10.9649469Z ('RERUN', {'yellow': True}) [0.7503s] [100%] 2025-12-04T10:49:10.9649836Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:35.515192939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9649860Z 2025-12-04T10:49:10.9650008Z [W1204 10:05:35.515377195 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650010Z 2025-12-04T10:49:10.9650160Z [W1204 10:05:35.515452213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650161Z 2025-12-04T10:49:10.9650310Z [W1204 10:05:35.515558180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9650312Z 2025-12-04T10:49:10.9650458Z [W1204 10:05:35.515629169 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650462Z 2025-12-04T10:49:10.9650612Z [W1204 10:05:35.515730707 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650614Z 2025-12-04T10:49:10.9650761Z [W1204 10:05:35.515791125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650762Z 2025-12-04T10:49:10.9650912Z [W1204 10:05:35.515877123 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9650914Z 2025-12-04T10:49:10.9651064Z [W1204 10:05:35.515935562 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9651065Z 2025-12-04T10:49:10.9651214Z [W1204 10:05:35.518533784 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9651216Z 2025-12-04T10:49:10.9651369Z [W1204 10:05:35.518640241 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9651370Z 2025-12-04T10:49:10.9651518Z [W1204 10:05:35.518711250 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9651520Z 2025-12-04T10:49:10.9651688Z [W1204 10:05:35.518803618 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9651690Z 2025-12-04T10:49:10.9651838Z [W1204 10:05:35.518863756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9651843Z 2025-12-04T10:49:10.9652024Z [W1204 10:05:35.518958634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652026Z 2025-12-04T10:49:10.9652174Z [W1204 10:05:35.519021433 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652178Z 2025-12-04T10:49:10.9652325Z [W1204 10:05:35.519109931 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652327Z 2025-12-04T10:49:10.9652478Z [W1204 10:05:35.519167759 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652480Z 2025-12-04T10:49:10.9652629Z [W1204 10:05:36.557488981 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652634Z 2025-12-04T10:49:10.9652781Z [W1204 10:05:36.557595629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652783Z 2025-12-04T10:49:10.9652936Z [W1204 10:05:36.557669537 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9652978Z 2025-12-04T10:49:10.9653127Z [W1204 10:05:36.557767665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9653129Z 2025-12-04T10:49:10.9653282Z [W1204 10:05:36.557827934 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9653284Z 2025-12-04T10:49:10.9653433Z [W1204 10:05:36.557924892 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9653435Z 2025-12-04T10:49:10.9653586Z [W1204 10:05:36.557984140 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9653588Z 2025-12-04T10:49:10.9653738Z [W1204 10:05:36.558074138 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9653742Z 2025-12-04T10:49:10.9653889Z [W1204 10:05:36.558133397 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9653891Z 2025-12-04T10:49:10.9653933Z FAILED [0.7665s] [100%] 2025-12-04T10:49:10.9653934Z 2025-12-04T10:49:10.9653990Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9654151Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9654200Z Traceback (most recent call last): 2025-12-04T10:49:10.9654362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9654403Z method(*args, **kwargs) 2025-12-04T10:49:10.9654557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9654598Z method(*args, **kwargs) 2025-12-04T10:49:10.9654752Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9654790Z with policy(): 2025-12-04T10:49:10.9654943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9655011Z raise RuntimeError(msg) 2025-12-04T10:49:10.9655414Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9655417Z 2025-12-04T10:49:10.9655492Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9655791Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9655793Z 2025-12-04T10:49:10.9655884Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9655958Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9656021Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9656198Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9656271Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9656309Z graph_break [] 2025-12-04T10:49:10.9656384Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9656757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9656802Z if out == self.unknown_value: 2025-12-04T10:49:10.9656965Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9657010Z Traceback (most recent call last): 2025-12-04T10:49:10.9657164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9657202Z method(*args, **kwargs) 2025-12-04T10:49:10.9657356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9657396Z method(*args, **kwargs) 2025-12-04T10:49:10.9657553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9657590Z with policy(): 2025-12-04T10:49:10.9657745Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9657786Z raise RuntimeError(msg) 2025-12-04T10:49:10.9658202Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9658204Z 2025-12-04T10:49:10.9658278Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9658575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9658579Z 2025-12-04T10:49:10.9658668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9658739Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9658800Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9658999Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9659075Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9659113Z graph_break [] 2025-12-04T10:49:10.9659188Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9659529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9659579Z if out == self.unknown_value: 2025-12-04T10:49:10.9659651Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9659709Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9659782Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9659962Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9659999Z graph_break [] 2025-12-04T10:49:10.9660055Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9660216Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9660285Z Traceback (most recent call last): 2025-12-04T10:49:10.9660442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9660481Z method(*args, **kwargs) 2025-12-04T10:49:10.9660634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9660676Z method(*args, **kwargs) 2025-12-04T10:49:10.9660829Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9660864Z with policy(): 2025-12-04T10:49:10.9661017Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9661058Z raise RuntimeError(msg) 2025-12-04T10:49:10.9661470Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9661474Z 2025-12-04T10:49:10.9661547Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9661842Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9661882Z 2025-12-04T10:49:10.9661968Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9662040Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9662100Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9662275Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9662350Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9662386Z graph_break [] 2025-12-04T10:49:10.9662457Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9662826Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9662873Z if out == self.unknown_value: 2025-12-04T10:49:10.9662945Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9663004Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9663076Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9663256Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9663293Z graph_break [] 2025-12-04T10:49:10.9663368Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9663425Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9663500Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9663673Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9663712Z graph_break [] 2025-12-04T10:49:10.9663955Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d220605347aad505.xml - 2025-12-04T10:49:10.9664044Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9664692Z FAILED [0.7665s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9664694Z 2025-12-04T10:49:10.9664767Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9665062Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9665066Z 2025-12-04T10:49:10.9665155Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9665217Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9665287Z ================== 1 failed, 57 deselected, 2 rerun in 11.91s ================== 2025-12-04T10:49:10.9665322Z Got exit code 1 2025-12-04T10:49:10.9665367Z Retrying single test... 2025-12-04T10:49:10.9665565Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-e870d6a08ae50bff.xml 2025-12-04T10:49:10.9665626Z ============================= test session starts ============================== 2025-12-04T10:49:10.9665739Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9665783Z cachedir: .pytest_cache 2025-12-04T10:49:10.9665942Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9665993Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9666032Z configfile: pytest.ini 2025-12-04T10:49:10.9666197Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9666269Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9666593Z stepcurrent: skipping 2 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9666639Z Running 1 items in this shard 2025-12-04T10:49:10.9666641Z 2025-12-04T10:49:10.9667085Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:45.394797173 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667089Z 2025-12-04T10:49:10.9667244Z [W1204 10:05:53.687741082 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667246Z 2025-12-04T10:49:10.9667398Z [W1204 10:05:53.687942718 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667400Z 2025-12-04T10:49:10.9667551Z [W1204 10:05:53.688439747 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667553Z 2025-12-04T10:49:10.9667699Z [W1204 10:05:53.688553174 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667701Z 2025-12-04T10:49:10.9667849Z [W1204 10:05:53.689704079 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9667875Z 2025-12-04T10:49:10.9668025Z [W1204 10:05:53.689777347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668026Z 2025-12-04T10:49:10.9668175Z [W1204 10:05:53.689890874 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9668177Z 2025-12-04T10:49:10.9668327Z [W1204 10:05:53.689956773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668329Z 2025-12-04T10:49:10.9668477Z [W1204 10:05:53.694383865 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668479Z 2025-12-04T10:49:10.9668624Z [W1204 10:05:53.694485092 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668628Z 2025-12-04T10:49:10.9668777Z [W1204 10:05:53.694561721 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668779Z 2025-12-04T10:49:10.9668927Z [W1204 10:05:53.694659548 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9668931Z 2025-12-04T10:49:10.9669080Z [W1204 10:05:53.694728047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669081Z 2025-12-04T10:49:10.9669230Z [W1204 10:05:53.694824495 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669232Z 2025-12-04T10:49:10.9669379Z [W1204 10:05:53.694886653 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669382Z 2025-12-04T10:49:10.9669528Z [W1204 10:05:53.694974491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669533Z 2025-12-04T10:49:10.9669680Z [W1204 10:05:53.695038880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9669682Z 2025-12-04T10:49:10.9669849Z [W1204 10:05:53.735677606 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669851Z 2025-12-04T10:49:10.9669996Z [W1204 10:05:53.735793543 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9669998Z 2025-12-04T10:49:10.9670147Z [W1204 10:05:53.735866311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670150Z 2025-12-04T10:49:10.9670296Z [W1204 10:05:53.735970689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670300Z 2025-12-04T10:49:10.9670445Z [W1204 10:05:53.736037708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670447Z 2025-12-04T10:49:10.9670598Z [W1204 10:05:53.736138565 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670600Z 2025-12-04T10:49:10.9670745Z [W1204 10:05:53.736197844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670747Z 2025-12-04T10:49:10.9670897Z [W1204 10:05:53.736283592 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9670920Z 2025-12-04T10:49:10.9671067Z [W1204 10:05:53.736340871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9671069Z 2025-12-04T10:49:10.9671122Z ('RERUN', {'yellow': True}) [10.1265s] [100%] 2025-12-04T10:49:10.9671494Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:54.927690332 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9671497Z 2025-12-04T10:49:10.9671645Z [W1204 10:05:54.927865998 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9671647Z 2025-12-04T10:49:10.9671795Z [W1204 10:05:54.927941097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9671799Z 2025-12-04T10:49:10.9671980Z [W1204 10:05:54.928056314 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9671982Z 2025-12-04T10:49:10.9672131Z [W1204 10:05:54.928119853 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9672133Z 2025-12-04T10:49:10.9672285Z [W1204 10:05:54.928221030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9672287Z 2025-12-04T10:49:10.9672433Z [W1204 10:05:54.928280529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9672435Z 2025-12-04T10:49:10.9672585Z [W1204 10:05:54.928368137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9672587Z 2025-12-04T10:49:10.9672735Z [W1204 10:05:54.928425826 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9672737Z 2025-12-04T10:49:10.9672886Z [W1204 10:05:54.931114066 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9672887Z 2025-12-04T10:49:10.9673063Z [W1204 10:05:54.931212694 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673065Z 2025-12-04T10:49:10.9673215Z [W1204 10:05:54.931285032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673217Z 2025-12-04T10:49:10.9673367Z [W1204 10:05:54.931380600 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673369Z 2025-12-04T10:49:10.9673517Z [W1204 10:05:54.931439989 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673520Z 2025-12-04T10:49:10.9673671Z [W1204 10:05:54.931535417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673673Z 2025-12-04T10:49:10.9673819Z [W1204 10:05:54.931595645 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673824Z 2025-12-04T10:49:10.9673972Z [W1204 10:05:54.931682423 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9673974Z 2025-12-04T10:49:10.9674122Z [W1204 10:05:54.931740932 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9674124Z 2025-12-04T10:49:10.9674269Z [W1204 10:05:54.972520805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9674296Z 2025-12-04T10:49:10.9674448Z [W1204 10:05:54.972624343 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9674450Z 2025-12-04T10:49:10.9674599Z [W1204 10:05:54.972697561 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9674604Z 2025-12-04T10:49:10.9674752Z [W1204 10:05:54.972798999 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9674754Z 2025-12-04T10:49:10.9674904Z [W1204 10:05:54.972858157 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9674906Z 2025-12-04T10:49:10.9675051Z [W1204 10:05:54.972954885 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9675054Z 2025-12-04T10:49:10.9675206Z [W1204 10:05:54.973017164 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9675208Z 2025-12-04T10:49:10.9675356Z [W1204 10:05:54.973105612 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9675359Z 2025-12-04T10:49:10.9675507Z [W1204 10:05:54.973162431 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9675509Z 2025-12-04T10:49:10.9675562Z ('RERUN', {'yellow': True}) [0.6451s] [100%] 2025-12-04T10:49:10.9675928Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:05:55.569208222 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9675932Z 2025-12-04T10:49:10.9676084Z [W1204 10:05:55.569383008 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676086Z 2025-12-04T10:49:10.9676255Z [W1204 10:05:55.569459016 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676257Z 2025-12-04T10:49:10.9676408Z [W1204 10:05:55.569565964 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676409Z 2025-12-04T10:49:10.9676561Z [W1204 10:05:55.569627822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676562Z 2025-12-04T10:49:10.9676709Z [W1204 10:05:55.569727760 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676713Z 2025-12-04T10:49:10.9676859Z [W1204 10:05:55.569787309 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9676861Z 2025-12-04T10:49:10.9677008Z [W1204 10:05:55.569874267 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9677011Z 2025-12-04T10:49:10.9677160Z [W1204 10:05:55.569932106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677161Z 2025-12-04T10:49:10.9677308Z [W1204 10:05:55.572552907 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677311Z 2025-12-04T10:49:10.9677457Z [W1204 10:05:55.572651975 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677481Z 2025-12-04T10:49:10.9677632Z [W1204 10:05:55.572722834 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677633Z 2025-12-04T10:49:10.9677780Z [W1204 10:05:55.572815311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677782Z 2025-12-04T10:49:10.9677931Z [W1204 10:05:55.572874220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9677933Z 2025-12-04T10:49:10.9678078Z [W1204 10:05:55.572968298 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678083Z 2025-12-04T10:49:10.9678230Z [W1204 10:05:55.573031817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678233Z 2025-12-04T10:49:10.9678379Z [W1204 10:05:55.573119135 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678381Z 2025-12-04T10:49:10.9678528Z [W1204 10:05:55.573176833 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9678530Z 2025-12-04T10:49:10.9678682Z [W1204 10:05:55.612273004 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678684Z 2025-12-04T10:49:10.9678832Z [W1204 10:05:55.612381392 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678834Z 2025-12-04T10:49:10.9678985Z [W1204 10:05:55.612455350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9678988Z 2025-12-04T10:49:10.9679138Z [W1204 10:05:55.612556078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9679140Z 2025-12-04T10:49:10.9679288Z [W1204 10:05:55.612615676 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9679290Z 2025-12-04T10:49:10.9679468Z [W1204 10:05:55.612712304 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9679470Z 2025-12-04T10:49:10.9679617Z [W1204 10:05:55.612770563 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9679619Z 2025-12-04T10:49:10.9679768Z [W1204 10:05:55.612854701 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9679770Z 2025-12-04T10:49:10.9679918Z [W1204 10:05:55.612911230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9679920Z 2025-12-04T10:49:10.9679959Z FAILED [0.6301s] [100%] 2025-12-04T10:49:10.9679960Z 2025-12-04T10:49:10.9680016Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9680175Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9680223Z Traceback (most recent call last): 2025-12-04T10:49:10.9680470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9680515Z method(*args, **kwargs) 2025-12-04T10:49:10.9680668Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9680712Z method(*args, **kwargs) 2025-12-04T10:49:10.9680891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9680932Z with policy(): 2025-12-04T10:49:10.9681085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9681129Z raise RuntimeError(msg) 2025-12-04T10:49:10.9681538Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:10.9681544Z 2025-12-04T10:49:10.9681619Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9681955Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9681961Z 2025-12-04T10:49:10.9682049Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9682125Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9682183Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9682367Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9682439Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9682478Z graph_break [] 2025-12-04T10:49:10.9682551Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9682899Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9682946Z if out == self.unknown_value: 2025-12-04T10:49:10.9683104Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9683153Z Traceback (most recent call last): 2025-12-04T10:49:10.9683332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9683376Z method(*args, **kwargs) 2025-12-04T10:49:10.9683528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9683570Z method(*args, **kwargs) 2025-12-04T10:49:10.9683721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9683763Z with policy(): 2025-12-04T10:49:10.9683913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9683958Z raise RuntimeError(msg) 2025-12-04T10:49:10.9684370Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9684372Z 2025-12-04T10:49:10.9684449Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9684742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9684777Z 2025-12-04T10:49:10.9684861Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9684936Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9684995Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9685176Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9685247Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9685285Z graph_break [] 2025-12-04T10:49:10.9685357Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9685702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9685749Z if out == self.unknown_value: 2025-12-04T10:49:10.9685823Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9685880Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9685949Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9686124Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9686165Z graph_break [] 2025-12-04T10:49:10.9686216Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9686376Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9686423Z Traceback (most recent call last): 2025-12-04T10:49:10.9686582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9686623Z method(*args, **kwargs) 2025-12-04T10:49:10.9686778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9686818Z method(*args, **kwargs) 2025-12-04T10:49:10.9686993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9687033Z with policy(): 2025-12-04T10:49:10.9687183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9687226Z raise RuntimeError(msg) 2025-12-04T10:49:10.9687633Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9687636Z 2025-12-04T10:49:10.9687710Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9688006Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9688008Z 2025-12-04T10:49:10.9688096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9688168Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9688228Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9688403Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9688502Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9688543Z graph_break [] 2025-12-04T10:49:10.9688613Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9688958Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9689001Z if out == self.unknown_value: 2025-12-04T10:49:10.9689073Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9689131Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9689205Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9689380Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9689420Z graph_break [] 2025-12-04T10:49:10.9689491Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9689548Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9689617Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9689792Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9689829Z graph_break [] 2025-12-04T10:49:10.9690070Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-e870d6a08ae50bff.xml - 2025-12-04T10:49:10.9690130Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9690802Z FAILED [0.6301s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9690806Z 2025-12-04T10:49:10.9690878Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9691171Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9691173Z 2025-12-04T10:49:10.9691259Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9691321Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9691386Z ================== 1 failed, 57 deselected, 2 rerun in 11.56s ================== 2025-12-04T10:49:10.9691424Z Got exit code 1 2025-12-04T10:49:10.9691675Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9691805Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9692064Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-28055b6c89b8637f.xml 2025-12-04T10:49:10.9692123Z ============================= test session starts ============================== 2025-12-04T10:49:10.9692233Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9692317Z cachedir: .pytest_cache 2025-12-04T10:49:10.9692477Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9692525Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9692565Z configfile: pytest.ini 2025-12-04T10:49:10.9692729Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9692801Z collecting ... collected 58 items / 3 deselected / 55 selected 2025-12-04T10:49:10.9692855Z stepcurrent: skipping 3 already run items. 2025-12-04T10:49:10.9692900Z Running 55 items in this shard 2025-12-04T10:49:10.9692902Z 2025-12-04T10:49:10.9693162Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7609s] [ 1%] 2025-12-04T10:49:10.9693416Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6880s] [ 1%] 2025-12-04T10:49:10.9693646Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.6910s] [ 1%] 2025-12-04T10:49:10.9693650Z 2025-12-04T10:49:10.9693705Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9693860Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9693908Z Traceback (most recent call last): 2025-12-04T10:49:10.9694065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9694110Z method(*args, **kwargs) 2025-12-04T10:49:10.9694264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9694306Z method(*args, **kwargs) 2025-12-04T10:49:10.9694456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9694494Z with policy(): 
2025-12-04T10:49:10.9694670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9694714Z raise RuntimeError(msg) 2025-12-04T10:49:10.9695116Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9695120Z 2025-12-04T10:49:10.9695196Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9695490Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9695492Z 2025-12-04T10:49:10.9695578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9695652Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9695709Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9695887Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9695958Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9696022Z graph_break [] 2025-12-04T10:49:10.9696178Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9696225Z Traceback (most recent call last): 2025-12-04T10:49:10.9696378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T10:49:10.9696420Z method(*args, **kwargs) 2025-12-04T10:49:10.9696571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9696613Z method(*args, **kwargs) 2025-12-04T10:49:10.9696762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9696801Z with policy(): 2025-12-04T10:49:10.9696951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9696996Z raise RuntimeError(msg) 2025-12-04T10:49:10.9697403Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9697406Z 2025-12-04T10:49:10.9697479Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9697776Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9697778Z 2025-12-04T10:49:10.9697862Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9697936Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9697996Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9698173Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9698243Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9698281Z graph_break [] 2025-12-04T10:49:10.9698374Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9698434Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9698503Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9698678Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9698716Z graph_break [] 2025-12-04T10:49:10.9698770Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9698928Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9698973Z Traceback (most recent call last): 2025-12-04T10:49:10.9699129Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9699171Z method(*args, **kwargs) 2025-12-04T10:49:10.9699324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9699363Z method(*args, **kwargs) 2025-12-04T10:49:10.9699514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9699550Z with policy(): 2025-12-04T10:49:10.9699702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9699767Z raise RuntimeError(msg) 2025-12-04T10:49:10.9700180Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9700182Z 2025-12-04T10:49:10.9700255Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9700550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9700552Z 2025-12-04T10:49:10.9700638Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9700711Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9700770Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9700943Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9701017Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9701054Z graph_break [] 2025-12-04T10:49:10.9701128Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9701184Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9701255Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9701428Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9701470Z graph_break [] 2025-12-04T10:49:10.9701540Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9701597Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9701666Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9701907Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), 
('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9701944Z graph_break [] 2025-12-04T10:49:10.9702187Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-28055b6c89b8637f.xml - 2025-12-04T10:49:10.9702247Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9702890Z FAILED [0.6910s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9702894Z 2025-12-04T10:49:10.9702970Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9703262Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9703264Z 2025-12-04T10:49:10.9703350Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9703411Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9703508Z =================== 1 failed, 3 deselected, 2 rerun in 4.31s =================== 2025-12-04T10:49:10.9703545Z Got exit code 1 2025-12-04T10:49:10.9703587Z Retrying single test... 
2025-12-04T10:49:10.9703782Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a61930cf1c9fe3c9.xml 2025-12-04T10:49:10.9703843Z ============================= test session starts ============================== 2025-12-04T10:49:10.9703956Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9703998Z cachedir: .pytest_cache 2025-12-04T10:49:10.9704157Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9704202Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9704245Z configfile: pytest.ini 2025-12-04T10:49:10.9704410Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9704485Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9704776Z stepcurrent: skipping 3 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9704823Z Running 1 items in this shard 2025-12-04T10:49:10.9704825Z 2025-12-04T10:49:10.9705193Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:15.215801807 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9705196Z 2025-12-04T10:49:10.9705349Z [W1204 10:06:23.669741215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9705352Z 2025-12-04T10:49:10.9705504Z [W1204 10:06:23.669877672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9705506Z 2025-12-04T10:49:10.9705678Z [W1204 10:06:23.670266233 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9705680Z 2025-12-04T10:49:10.9705830Z [W1204 10:06:23.670355591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9705832Z 2025-12-04T10:49:10.9705978Z [W1204 10:06:23.670957218 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9705980Z 2025-12-04T10:49:10.9706129Z [W1204 10:06:23.671023467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9706132Z 2025-12-04T10:49:10.9706280Z [W1204 10:06:23.671130774 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9706282Z 2025-12-04T10:49:10.9706428Z [W1204 10:06:23.671189203 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9706431Z 2025-12-04T10:49:10.9706583Z [W1204 10:06:23.675576746 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9706585Z 2025-12-04T10:49:10.9706733Z [W1204 10:06:23.675676624 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9706735Z 2025-12-04T10:49:10.9706883Z [W1204 10:06:23.675751952 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9706915Z 2025-12-04T10:49:10.9707064Z [W1204 10:06:23.675850370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707066Z 2025-12-04T10:49:10.9707212Z [W1204 10:06:23.675911089 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707214Z 2025-12-04T10:49:10.9707365Z [W1204 10:06:23.676013467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707366Z 2025-12-04T10:49:10.9707514Z [W1204 10:06:23.676075685 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707516Z 2025-12-04T10:49:10.9707664Z [W1204 10:06:23.676165933 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707667Z 2025-12-04T10:49:10.9707814Z [W1204 10:06:23.676225202 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707818Z 2025-12-04T10:49:10.9707964Z [W1204 10:06:23.716468046 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9707966Z 2025-12-04T10:49:10.9708116Z [W1204 10:06:23.716593943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9708118Z 2025-12-04T10:49:10.9708265Z [W1204 10:06:23.716667502 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9708267Z 2025-12-04T10:49:10.9708416Z [W1204 10:06:23.716773719 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9708420Z 2025-12-04T10:49:10.9708567Z [W1204 10:06:23.716834778 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9708572Z 2025-12-04T10:49:10.9708719Z [W1204 10:06:23.716933036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9708721Z 2025-12-04T10:49:10.9708887Z [W1204 10:06:23.716992164 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9708889Z 2025-12-04T10:49:10.9709036Z [W1204 10:06:23.717083182 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9709037Z 2025-12-04T10:49:10.9709185Z [W1204 10:06:23.717143291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9709187Z 2025-12-04T10:49:10.9709239Z ('RERUN', {'yellow': True}) [10.1440s] [100%] 2025-12-04T10:49:10.9709606Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:24.841572366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9709608Z 2025-12-04T10:49:10.9709760Z [W1204 10:06:24.841755952 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9709762Z 2025-12-04T10:49:10.9709909Z [W1204 10:06:24.841830570 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9709911Z 2025-12-04T10:49:10.9710059Z [W1204 10:06:24.841938928 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710061Z 2025-12-04T10:49:10.9710230Z [W1204 10:06:24.842005527 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710231Z 2025-12-04T10:49:10.9710381Z [W1204 10:06:24.842112384 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710383Z 2025-12-04T10:49:10.9710533Z [W1204 10:06:24.842172943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710535Z 2025-12-04T10:49:10.9710682Z [W1204 10:06:24.842259791 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710684Z 2025-12-04T10:49:10.9710832Z [W1204 10:06:24.842318470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710834Z 2025-12-04T10:49:10.9710980Z [W1204 10:06:24.844936262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9710984Z 2025-12-04T10:49:10.9711132Z [W1204 10:06:24.845039990 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9711134Z 2025-12-04T10:49:10.9711282Z [W1204 10:06:24.845113128 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9711286Z 2025-12-04T10:49:10.9711433Z [W1204 10:06:24.845206756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9711435Z 2025-12-04T10:49:10.9711583Z [W1204 10:06:24.845266515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9711585Z 2025-12-04T10:49:10.9711731Z [W1204 10:06:24.845361063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9711734Z 2025-12-04T10:49:10.9711921Z [W1204 10:06:24.845420471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9711923Z 2025-12-04T10:49:10.9712100Z [W1204 10:06:24.845509220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712105Z 2025-12-04T10:49:10.9712252Z [W1204 10:06:24.845567518 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712254Z 2025-12-04T10:49:10.9712402Z [W1204 10:06:24.883503303 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712404Z 2025-12-04T10:49:10.9712550Z [W1204 10:06:24.883607011 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712554Z 2025-12-04T10:49:10.9712703Z [W1204 10:06:24.883686309 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712705Z 2025-12-04T10:49:10.9712853Z [W1204 10:06:24.883786667 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9712856Z 2025-12-04T10:49:10.9713005Z [W1204 10:06:24.883846996 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9713007Z 2025-12-04T10:49:10.9713154Z [W1204 10:06:24.883944214 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9713156Z 2025-12-04T10:49:10.9713303Z [W1204 10:06:24.884007002 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9713334Z 2025-12-04T10:49:10.9713483Z [W1204 10:06:24.884095390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9713484Z 2025-12-04T10:49:10.9713631Z [W1204 10:06:24.884153239 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9713632Z 2025-12-04T10:49:10.9713686Z ('RERUN', {'yellow': True}) [0.6564s] [100%] 2025-12-04T10:49:10.9714050Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:25.527255024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714051Z 2025-12-04T10:49:10.9714198Z [W1204 10:06:25.527429870 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714202Z 2025-12-04T10:49:10.9714350Z [W1204 10:06:25.527503099 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714351Z 2025-12-04T10:49:10.9714498Z [W1204 10:06:25.527608986 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9714502Z 2025-12-04T10:49:10.9714649Z [W1204 10:06:25.527671165 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714651Z 2025-12-04T10:49:10.9714802Z [W1204 10:06:25.527770653 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714804Z 2025-12-04T10:49:10.9714951Z [W1204 10:06:25.527829142 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9714954Z 2025-12-04T10:49:10.9715102Z [W1204 10:06:25.527913890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715104Z 2025-12-04T10:49:10.9715250Z [W1204 10:06:25.527971579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715252Z 2025-12-04T10:49:10.9715419Z [W1204 10:06:25.530516223 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715421Z 2025-12-04T10:49:10.9715568Z [W1204 10:06:25.530612020 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715572Z 2025-12-04T10:49:10.9715719Z [W1204 10:06:25.530681409 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715722Z 2025-12-04T10:49:10.9715870Z [W1204 10:06:25.530772687 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9715871Z 2025-12-04T10:49:10.9716017Z [W1204 10:06:25.530831866 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9716019Z 2025-12-04T10:49:10.9716168Z [W1204 10:06:25.530925864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716170Z 2025-12-04T10:49:10.9716316Z [W1204 10:06:25.530985172 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716320Z 2025-12-04T10:49:10.9716466Z [W1204 10:06:25.531075160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716488Z 2025-12-04T10:49:10.9716637Z [W1204 10:06:25.531135479 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716639Z 2025-12-04T10:49:10.9716784Z [W1204 10:06:25.569201781 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716786Z 2025-12-04T10:49:10.9716938Z [W1204 10:06:25.569303699 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9716940Z 2025-12-04T10:49:10.9717087Z [W1204 10:06:25.569388727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717089Z 2025-12-04T10:49:10.9717238Z [W1204 10:06:25.569489805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717240Z 2025-12-04T10:49:10.9717390Z [W1204 10:06:25.569550064 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717392Z 2025-12-04T10:49:10.9717540Z [W1204 10:06:25.569646592 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9717542Z 2025-12-04T10:49:10.9717693Z [W1204 10:06:25.569705210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717695Z 2025-12-04T10:49:10.9717842Z [W1204 10:06:25.569789118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717843Z 2025-12-04T10:49:10.9717993Z [W1204 10:06:25.569845847 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9717995Z 2025-12-04T10:49:10.9718035Z FAILED [0.7152s] [100%] 2025-12-04T10:49:10.9718039Z 2025-12-04T10:49:10.9718090Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9718249Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9718295Z Traceback (most recent call last): 2025-12-04T10:49:10.9718482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9718523Z method(*args, **kwargs) 2025-12-04T10:49:10.9718676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9718715Z method(*args, **kwargs) 2025-12-04T10:49:10.9718868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9718905Z with policy(): 2025-12-04T10:49:10.9719059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9719102Z raise RuntimeError(msg) 2025-12-04T10:49:10.9719512Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9719515Z 2025-12-04T10:49:10.9719588Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9719885Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9719887Z 2025-12-04T10:49:10.9719976Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9720070Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9720130Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9720306Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9720381Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9720418Z graph_break [] 2025-12-04T10:49:10.9720492Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9720835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9720881Z if out == self.unknown_value: 2025-12-04T10:49:10.9721040Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9721087Z Traceback (most recent call last): 2025-12-04T10:49:10.9721240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9721282Z method(*args, **kwargs) 2025-12-04T10:49:10.9721433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9721475Z method(*args, **kwargs) 2025-12-04T10:49:10.9721625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9721662Z with policy(): 2025-12-04T10:49:10.9721815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9721901Z raise RuntimeError(msg) 2025-12-04T10:49:10.9722315Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9722318Z 2025-12-04T10:49:10.9722418Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9722713Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9722715Z 2025-12-04T10:49:10.9722801Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9722873Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9722932Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9723111Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9723184Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9723221Z graph_break [] 2025-12-04T10:49:10.9723297Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9723638Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9723684Z if out == self.unknown_value: 2025-12-04T10:49:10.9723755Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9723846Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9723917Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9724094Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9724131Z graph_break [] 2025-12-04T10:49:10.9724187Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9724343Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9724390Z Traceback (most recent call last): 2025-12-04T10:49:10.9724543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9724585Z method(*args, **kwargs) 2025-12-04T10:49:10.9724734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9724777Z method(*args, **kwargs) 2025-12-04T10:49:10.9724926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9724965Z with policy(): 2025-12-04T10:49:10.9725117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9725161Z raise RuntimeError(msg) 2025-12-04T10:49:10.9725573Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9725576Z 2025-12-04T10:49:10.9725648Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9725943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9725945Z 2025-12-04T10:49:10.9726030Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9726124Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9726181Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9726357Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9726429Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9726467Z graph_break [] 2025-12-04T10:49:10.9726538Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9726883Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9726928Z if out == self.unknown_value: 2025-12-04T10:49:10.9727000Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9727059Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9727129Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9727306Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9727342Z graph_break [] 2025-12-04T10:49:10.9727414Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9727495Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9727566Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9727740Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9727778Z graph_break [] 2025-12-04T10:49:10.9728022Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a61930cf1c9fe3c9.xml - 2025-12-04T10:49:10.9728082Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9728720Z FAILED [0.7152s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9728726Z 2025-12-04T10:49:10.9728797Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9729093Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9729095Z 2025-12-04T10:49:10.9729180Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9729241Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9729307Z ================== 1 failed, 57 deselected, 2 rerun in 11.69s ================== 2025-12-04T10:49:10.9729348Z Got exit code 1 2025-12-04T10:49:10.9729388Z Retrying single test... 2025-12-04T10:49:10.9729585Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-46c082333ed11825.xml 2025-12-04T10:49:10.9729641Z ============================= test session starts ============================== 2025-12-04T10:49:10.9729774Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9729815Z cachedir: .pytest_cache 2025-12-04T10:49:10.9729973Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9730018Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9730061Z configfile: pytest.ini 2025-12-04T10:49:10.9730222Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9730299Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9730592Z stepcurrent: skipping 3 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9730636Z Running 1 items in this shard 2025-12-04T10:49:10.9730638Z 2025-12-04T10:49:10.9731008Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:34.466448015 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731010Z 2025-12-04T10:49:10.9731161Z [W1204 10:06:42.804275878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731163Z 2025-12-04T10:49:10.9731334Z [W1204 10:06:42.804418425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731336Z 2025-12-04T10:49:10.9731486Z [W1204 10:06:42.804851875 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731488Z 2025-12-04T10:49:10.9731636Z [W1204 10:06:42.804941393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731638Z 2025-12-04T10:49:10.9731789Z [W1204 10:06:42.805960331 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731790Z 2025-12-04T10:49:10.9731982Z [W1204 10:06:42.806032919 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9731984Z 2025-12-04T10:49:10.9732133Z [W1204 10:06:42.806144407 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9732136Z 2025-12-04T10:49:10.9732286Z [W1204 10:06:42.806207045 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9732287Z 2025-12-04T10:49:10.9732435Z [W1204 10:06:42.810697957 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9732437Z 2025-12-04T10:49:10.9732585Z [W1204 10:06:42.810796765 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9732587Z 2025-12-04T10:49:10.9732734Z [W1204 10:06:42.810869714 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9732736Z 2025-12-04T10:49:10.9732883Z [W1204 10:06:42.810969631 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9732886Z 2025-12-04T10:49:10.9733033Z [W1204 10:06:42.811035880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733037Z 2025-12-04T10:49:10.9733220Z [W1204 10:06:42.811136888 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733222Z 2025-12-04T10:49:10.9733371Z [W1204 10:06:42.811197426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733373Z 2025-12-04T10:49:10.9733520Z [W1204 10:06:42.811283885 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733522Z 2025-12-04T10:49:10.9733671Z [W1204 10:06:42.811342743 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9733674Z 2025-12-04T10:49:10.9733820Z [W1204 10:06:42.851587213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733824Z 2025-12-04T10:49:10.9733972Z [W1204 10:06:42.851720650 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9733975Z 2025-12-04T10:49:10.9734124Z [W1204 10:06:42.851794658 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734126Z 2025-12-04T10:49:10.9734273Z [W1204 10:06:42.851902756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734275Z 2025-12-04T10:49:10.9734422Z [W1204 10:06:42.851963285 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734450Z 2025-12-04T10:49:10.9734598Z [W1204 10:06:42.852066942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734600Z 2025-12-04T10:49:10.9734749Z [W1204 10:06:42.852128741 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734751Z 2025-12-04T10:49:10.9734902Z [W1204 10:06:42.852216049 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9734904Z 2025-12-04T10:49:10.9735050Z [W1204 10:06:42.852274108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9735052Z 2025-12-04T10:49:10.9735104Z ('RERUN', {'yellow': True}) [10.1094s] [100%] 2025-12-04T10:49:10.9735465Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:43.995324777 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9735469Z 2025-12-04T10:49:10.9735617Z [W1204 10:06:43.995507793 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9735620Z 2025-12-04T10:49:10.9735769Z [W1204 10:06:43.995583171 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9735771Z 2025-12-04T10:49:10.9735919Z [W1204 10:06:43.995693669 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9735921Z 2025-12-04T10:49:10.9736071Z [W1204 10:06:43.995756058 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736075Z 2025-12-04T10:49:10.9736221Z [W1204 10:06:43.995856195 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736223Z 2025-12-04T10:49:10.9736371Z [W1204 10:06:43.995924114 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736373Z 2025-12-04T10:49:10.9736536Z [W1204 10:06:43.996014852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9736540Z 2025-12-04T10:49:10.9736688Z [W1204 10:06:43.996076201 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736690Z 2025-12-04T10:49:10.9736839Z [W1204 10:06:43.998729252 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736843Z 2025-12-04T10:49:10.9736988Z [W1204 10:06:43.998831150 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9736990Z 2025-12-04T10:49:10.9737138Z [W1204 10:06:43.998903479 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737140Z 2025-12-04T10:49:10.9737288Z [W1204 10:06:43.998999677 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737292Z 2025-12-04T10:49:10.9737439Z [W1204 10:06:43.999066625 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737440Z 2025-12-04T10:49:10.9737589Z [W1204 10:06:43.999163073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737622Z 2025-12-04T10:49:10.9737768Z [W1204 10:06:43.999222472 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737770Z 2025-12-04T10:49:10.9737920Z [W1204 10:06:43.999308690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9737922Z 2025-12-04T10:49:10.9745042Z [W1204 10:06:43.999366759 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9745046Z 2025-12-04T10:49:10.9745202Z [W1204 10:06:43.037889846 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745205Z 2025-12-04T10:49:10.9745353Z [W1204 10:06:43.037995724 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745355Z 2025-12-04T10:49:10.9745508Z [W1204 10:06:43.038076062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745510Z 2025-12-04T10:49:10.9745656Z [W1204 10:06:43.038181540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745658Z 2025-12-04T10:49:10.9745807Z [W1204 10:06:43.038242578 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745809Z 2025-12-04T10:49:10.9745957Z [W1204 10:06:43.038341166 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9745959Z 2025-12-04T10:49:10.9746109Z [W1204 10:06:43.038400945 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9746110Z 2025-12-04T10:49:10.9746260Z [W1204 10:06:43.038487143 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9746267Z 2025-12-04T10:49:10.9746413Z [W1204 10:06:43.038544722 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9746415Z 2025-12-04T10:49:10.9746467Z ('RERUN', {'yellow': True}) [0.6839s] [100%] 2025-12-04T10:49:10.9746888Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:06:44.680068894 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9746891Z 2025-12-04T10:49:10.9747040Z [W1204 10:06:44.680248560 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747042Z 2025-12-04T10:49:10.9747191Z [W1204 10:06:44.680324509 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747195Z 2025-12-04T10:49:10.9747341Z [W1204 10:06:44.680432436 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747343Z 2025-12-04T10:49:10.9747492Z [W1204 10:06:44.680494165 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747494Z 2025-12-04T10:49:10.9747639Z [W1204 10:06:44.680595863 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747641Z 2025-12-04T10:49:10.9747789Z [W1204 10:06:44.680662231 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9747791Z 2025-12-04T10:49:10.9747937Z [W1204 10:06:44.680749059 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9747971Z 2025-12-04T10:49:10.9748118Z [W1204 10:06:44.680806378 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748120Z 2025-12-04T10:49:10.9748271Z [W1204 10:06:44.683421481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748273Z 2025-12-04T10:49:10.9748421Z [W1204 10:06:44.683523219 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748423Z 2025-12-04T10:49:10.9748572Z [W1204 10:06:44.683594747 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748574Z 2025-12-04T10:49:10.9748721Z [W1204 10:06:44.683690565 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748726Z 2025-12-04T10:49:10.9748874Z [W1204 10:06:44.683749954 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9748876Z 2025-12-04T10:49:10.9749025Z [W1204 10:06:44.683845192 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749028Z 2025-12-04T10:49:10.9749178Z [W1204 10:06:44.683903890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749179Z 2025-12-04T10:49:10.9749329Z [W1204 10:06:44.683988199 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749331Z 2025-12-04T10:49:10.9749479Z [W1204 10:06:44.684049327 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9749485Z 2025-12-04T10:49:10.9749632Z [W1204 10:06:44.722697292 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749634Z 2025-12-04T10:49:10.9749784Z [W1204 10:06:44.722801280 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749785Z 2025-12-04T10:49:10.9749954Z [W1204 10:06:44.722874478 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9749956Z 2025-12-04T10:49:10.9750104Z [W1204 10:06:44.722975616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9750105Z 2025-12-04T10:49:10.9750251Z [W1204 10:06:44.723038875 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9750254Z 2025-12-04T10:49:10.9750402Z [W1204 10:06:44.723139183 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9750404Z 2025-12-04T10:49:10.9750552Z [W1204 10:06:44.723197791 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9750553Z 2025-12-04T10:49:10.9750700Z [W1204 10:06:44.723283749 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9750702Z 2025-12-04T10:49:10.9750850Z [W1204 10:06:44.723342728 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9750852Z 2025-12-04T10:49:10.9750891Z FAILED [0.6812s] [100%] 2025-12-04T10:49:10.9750894Z 2025-12-04T10:49:10.9750950Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9751138Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9751186Z Traceback (most recent call last): 2025-12-04T10:49:10.9751352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9751396Z method(*args, **kwargs) 2025-12-04T10:49:10.9751551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9751593Z method(*args, **kwargs) 2025-12-04T10:49:10.9751744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9751781Z with policy(): 2025-12-04T10:49:10.9751968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9752011Z raise RuntimeError(msg) 2025-12-04T10:49:10.9752418Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 24576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:10.9752420Z 2025-12-04T10:49:10.9752498Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9752798Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9752800Z 2025-12-04T10:49:10.9752888Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9752968Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9753031Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9753213Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9753288Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9753325Z graph_break [] 2025-12-04T10:49:10.9753430Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9753782Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9753828Z if out == self.unknown_value: 2025-12-04T10:49:10.9753988Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9754037Z Traceback (most recent call last): 2025-12-04T10:49:10.9754192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9754233Z method(*args, **kwargs) 2025-12-04T10:49:10.9754383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9754424Z method(*args, **kwargs) 2025-12-04T10:49:10.9754572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9754610Z with policy(): 2025-12-04T10:49:10.9754760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9754801Z raise RuntimeError(msg) 2025-12-04T10:49:10.9755210Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 24576 and is now reported as 49152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9755245Z 2025-12-04T10:49:10.9755319Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9755617Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9755620Z 2025-12-04T10:49:10.9755706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9755778Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9755836Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9756016Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9756088Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9756126Z graph_break [] 2025-12-04T10:49:10.9756196Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9756541Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9756584Z if out == self.unknown_value: 2025-12-04T10:49:10.9756658Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9756714Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9756786Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9756964Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9757000Z graph_break [] 2025-12-04T10:49:10.9757054Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9757233Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9757280Z Traceback (most recent call last): 2025-12-04T10:49:10.9757434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9757474Z method(*args, **kwargs) 2025-12-04T10:49:10.9757625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9757666Z method(*args, **kwargs) 2025-12-04T10:49:10.9757817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9757853Z with policy(): 2025-12-04T10:49:10.9758003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9758044Z raise RuntimeError(msg) 2025-12-04T10:49:10.9758456Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9758459Z 2025-12-04T10:49:10.9758533Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9758827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9758854Z 2025-12-04T10:49:10.9758939Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9759010Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9759068Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9759244Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9759315Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9759352Z graph_break [] 2025-12-04T10:49:10.9759423Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9759767Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9759811Z if out == self.unknown_value: 2025-12-04T10:49:10.9759883Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9759938Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9760010Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9760186Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9760221Z graph_break [] 2025-12-04T10:49:10.9760293Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9760348Z stats [('calls_captured', 18), ('unique_graphs', 1)] 2025-12-04T10:49:10.9760420Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9760592Z inductor [('pattern_matcher_nodes', 18), ('woq_matcher_nodes', 12), ('pattern_matcher_count', 9), ('woq_matcher_count', 3), ('extern_calls', 3), ('fxgraph_cache_miss', 1)] 2025-12-04T10:49:10.9760629Z graph_break [] 2025-12-04T10:49:10.9760890Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-46c082333ed11825.xml - 2025-12-04T10:49:10.9760950Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9761595Z FAILED [0.6812s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 49152 and is now reported as 73728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9761601Z 2025-12-04T10:49:10.9761673Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9762005Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9762007Z 2025-12-04T10:49:10.9762091Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9762153Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9762220Z ================== 1 failed, 57 deselected, 2 rerun in 11.62s ================== 2025-12-04T10:49:10.9762258Z Got exit code 1 2025-12-04T10:49:10.9762505Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9762664Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9762860Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6248e220be129902.xml 2025-12-04T10:49:10.9762921Z ============================= test session starts ============================== 2025-12-04T10:49:10.9763035Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9763078Z cachedir: .pytest_cache 2025-12-04T10:49:10.9763238Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9763284Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9763327Z configfile: pytest.ini 2025-12-04T10:49:10.9763493Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9763567Z collecting ... collected 58 items / 4 deselected / 54 selected 2025-12-04T10:49:10.9763619Z stepcurrent: skipping 4 already run items. 2025-12-04T10:49:10.9763664Z Running 54 items in this shard 2025-12-04T10:49:10.9763666Z 2025-12-04T10:49:10.9763919Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.9080s] [ 1%] 2025-12-04T10:49:10.9764167Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7560s] [ 1%] 2025-12-04T10:49:10.9764388Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.7428s] [ 1%] 2025-12-04T10:49:10.9764393Z 2025-12-04T10:49:10.9764444Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9764594Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9764640Z Traceback (most recent call last): 2025-12-04T10:49:10.9764822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9764863Z method(*args, **kwargs) 2025-12-04T10:49:10.9765015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9765054Z method(*args, **kwargs) 2025-12-04T10:49:10.9765204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9765243Z with policy(): 2025-12-04T10:49:10.9765395Z 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9765434Z raise RuntimeError(msg) 2025-12-04T10:49:10.9765837Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9765839Z 2025-12-04T10:49:10.9765912Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9766204Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9766238Z 2025-12-04T10:49:10.9766323Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9766398Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9766455Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9766731Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9766805Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9766841Z graph_break [] 2025-12-04T10:49:10.9766993Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9767038Z Traceback (most recent call last): 2025-12-04T10:49:10.9767192Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9767234Z method(*args, **kwargs) 2025-12-04T10:49:10.9767385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9767424Z method(*args, **kwargs) 2025-12-04T10:49:10.9767577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9767613Z with policy(): 2025-12-04T10:49:10.9767765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9767805Z raise RuntimeError(msg) 2025-12-04T10:49:10.9768218Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9768222Z 2025-12-04T10:49:10.9768294Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9768601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9768604Z 2025-12-04T10:49:10.9768689Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9768761Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9768818Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9769087Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9769164Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9769202Z graph_break [] 2025-12-04T10:49:10.9769275Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9769329Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9769403Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9769669Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9769709Z graph_break [] 2025-12-04T10:49:10.9769759Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9769913Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 
2025-12-04T10:49:10.9769985Z Traceback (most recent call last): 2025-12-04T10:49:10.9770137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9770180Z method(*args, **kwargs) 2025-12-04T10:49:10.9770331Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9770371Z method(*args, **kwargs) 2025-12-04T10:49:10.9770520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9770558Z with policy(): 2025-12-04T10:49:10.9770710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9770753Z raise RuntimeError(msg) 2025-12-04T10:49:10.9771159Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9771164Z 2025-12-04T10:49:10.9771235Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9771523Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9771526Z 2025-12-04T10:49:10.9771611Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9771684Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9771737Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9772046Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9772117Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9772156Z graph_break [] 2025-12-04T10:49:10.9772259Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9772316Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9772386Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9772656Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9772693Z graph_break [] 2025-12-04T10:49:10.9772768Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9772820Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9772891Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9773158Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9773198Z graph_break [] 2025-12-04T10:49:10.9773440Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6248e220be129902.xml - 2025-12-04T10:49:10.9773502Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9774141Z FAILED [0.7428s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9774175Z 2025-12-04T10:49:10.9774247Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9774538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9774540Z 2025-12-04T10:49:10.9774625Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9774691Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:10.9774761Z =================== 1 failed, 4 deselected, 2 rerun in 4.56s =================== 2025-12-04T10:49:10.9774798Z Got exit code 1 2025-12-04T10:49:10.9774841Z Retrying single test... 2025-12-04T10:49:10.9775037Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f46b39137757c03c.xml 2025-12-04T10:49:10.9775097Z ============================= test session starts ============================== 2025-12-04T10:49:10.9775209Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9775252Z cachedir: .pytest_cache 2025-12-04T10:49:10.9775411Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9775459Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9775499Z configfile: pytest.ini 2025-12-04T10:49:10.9775667Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9775740Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9776045Z stepcurrent: skipping 4 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9776089Z Running 1 items in this shard 2025-12-04T10:49:10.9776091Z 2025-12-04T10:49:10.9776455Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:05.104930512 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9776458Z 2025-12-04T10:49:10.9776612Z [W1204 10:07:12.439456024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9776616Z 2025-12-04T10:49:10.9776765Z [W1204 10:07:12.439630390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9776767Z 2025-12-04T10:49:10.9776917Z [W1204 10:07:12.443549285 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9776919Z 2025-12-04T10:49:10.9777067Z [W1204 10:07:12.443855079 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9777069Z 2025-12-04T10:49:10.9777217Z [W1204 10:07:12.443929477 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9777219Z 2025-12-04T10:49:10.9777370Z [W1204 10:07:12.446440313 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9777397Z 2025-12-04T10:49:10.9777544Z [W1204 10:07:12.446711507 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9777546Z 2025-12-04T10:49:10.9777696Z [W1204 10:07:12.446785565 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9777698Z 2025-12-04T10:49:10.9777749Z ('RERUN', {'yellow': True}) [10.2308s] [100%] 2025-12-04T10:49:10.9778107Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:13.335397363 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778109Z 2025-12-04T10:49:10.9778256Z [W1204 10:07:13.335757815 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778261Z 2025-12-04T10:49:10.9778408Z [W1204 10:07:13.335837553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778410Z 2025-12-04T10:49:10.9778560Z [W1204 10:07:13.337249043 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778562Z 2025-12-04T10:49:10.9778707Z [W1204 10:07:13.337499957 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778709Z 2025-12-04T10:49:10.9778856Z [W1204 10:07:13.337574526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9778857Z 2025-12-04T10:49:10.9779004Z [W1204 10:07:13.339778588 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9779008Z 2025-12-04T10:49:10.9779157Z [W1204 10:07:13.340037992 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9779159Z 2025-12-04T10:49:10.9779309Z [W1204 10:07:13.340114671 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9779336Z 2025-12-04T10:49:10.9779384Z ('RERUN', {'yellow': True}) [0.7439s] [100%] 2025-12-04T10:49:10.9779745Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:14.114108515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9779747Z 2025-12-04T10:49:10.9779896Z [W1204 10:07:14.114495417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9779899Z 2025-12-04T10:49:10.9780049Z [W1204 10:07:14.114575645 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780051Z 2025-12-04T10:49:10.9780199Z [W1204 10:07:14.115953895 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780201Z 2025-12-04T10:49:10.9780346Z [W1204 10:07:14.116208620 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780348Z 2025-12-04T10:49:10.9780497Z [W1204 10:07:14.116285188 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780499Z 2025-12-04T10:49:10.9780645Z [W1204 10:07:14.118472420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9780668Z 2025-12-04T10:49:10.9780815Z [W1204 10:07:14.118726815 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780817Z 2025-12-04T10:49:10.9780966Z [W1204 10:07:14.118802273 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9780972Z 2025-12-04T10:49:10.9781010Z FAILED [0.7789s] [100%] 2025-12-04T10:49:10.9781012Z 2025-12-04T10:49:10.9781065Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9781217Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9781265Z Traceback (most recent call last): 2025-12-04T10:49:10.9781423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9781469Z method(*args, **kwargs) 2025-12-04T10:49:10.9781623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9781665Z method(*args, **kwargs) 2025-12-04T10:49:10.9781815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9781902Z with policy(): 2025-12-04T10:49:10.9782055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9782098Z raise RuntimeError(msg) 2025-12-04T10:49:10.9782497Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9782502Z 2025-12-04T10:49:10.9782575Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9782867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9782869Z 2025-12-04T10:49:10.9782980Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9783053Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9783109Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9783383Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9783457Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9783496Z graph_break [] 2025-12-04T10:49:10.9783566Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9783915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9783960Z if out == self.unknown_value: 2025-12-04T10:49:10.9784114Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9784158Z Traceback (most recent call last): 2025-12-04T10:49:10.9784310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9784350Z method(*args, **kwargs) 2025-12-04T10:49:10.9784526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9784568Z method(*args, **kwargs) 2025-12-04T10:49:10.9784718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9784756Z with policy(): 2025-12-04T10:49:10.9784908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9784950Z raise RuntimeError(msg) 2025-12-04T10:49:10.9785357Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9785361Z 2025-12-04T10:49:10.9785436Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9785723Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9785727Z 2025-12-04T10:49:10.9785813Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9785885Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9785940Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9786308Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9786379Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9786420Z graph_break [] 2025-12-04T10:49:10.9786492Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9786861Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9786906Z if out == self.unknown_value: 2025-12-04T10:49:10.9786979Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9787034Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9787105Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9787372Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9787412Z graph_break [] 2025-12-04T10:49:10.9787462Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9787614Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9787660Z Traceback (most recent call last): 2025-12-04T10:49:10.9787814Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9787855Z method(*args, **kwargs) 2025-12-04T10:49:10.9788003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9788044Z method(*args, **kwargs) 2025-12-04T10:49:10.9788193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9788253Z with policy(): 2025-12-04T10:49:10.9788403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9788443Z raise RuntimeError(msg) 2025-12-04T10:49:10.9788853Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9788856Z 2025-12-04T10:49:10.9788927Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9789215Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9789219Z 2025-12-04T10:49:10.9789303Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9789375Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9789430Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9789701Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9789772Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9789808Z graph_break [] 2025-12-04T10:49:10.9789879Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9790218Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9790263Z if out == self.unknown_value: 2025-12-04T10:49:10.9790332Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9790386Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9790475Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9790744Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9790780Z graph_break [] 2025-12-04T10:49:10.9790850Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9790903Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9790975Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9791241Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9791278Z graph_break [] 2025-12-04T10:49:10.9791520Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f46b39137757c03c.xml - 2025-12-04T10:49:10.9791579Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9792257Z FAILED [0.7789s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9792287Z 2025-12-04T10:49:10.9792359Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9792648Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9792650Z 2025-12-04T10:49:10.9792736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9792797Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9792865Z ================== 1 failed, 57 deselected, 2 rerun in 11.90s ================== 2025-12-04T10:49:10.9792901Z Got exit code 1 2025-12-04T10:49:10.9792942Z Retrying single test... 2025-12-04T10:49:10.9793138Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fa322d5db22c17b8.xml 2025-12-04T10:49:10.9793194Z ============================= test session starts ============================== 2025-12-04T10:49:10.9793305Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9793348Z cachedir: .pytest_cache 2025-12-04T10:49:10.9793504Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9793551Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9793590Z configfile: pytest.ini 2025-12-04T10:49:10.9793754Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9793826Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9794112Z stepcurrent: skipping 4 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9794155Z Running 1 items in this shard 2025-12-04T10:49:10.9794157Z 2025-12-04T10:49:10.9794549Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:24.868275860 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9794551Z 2025-12-04T10:49:10.9794704Z [W1204 10:07:31.503584730 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9794706Z 2025-12-04T10:49:10.9794855Z [W1204 10:07:31.503773596 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9794859Z 2025-12-04T10:49:10.9795007Z [W1204 10:07:31.507732661 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9795009Z 2025-12-04T10:49:10.9795156Z [W1204 10:07:31.508028845 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9795160Z 2025-12-04T10:49:10.9795308Z [W1204 10:07:31.508106003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9795310Z 2025-12-04T10:49:10.9795459Z [W1204 10:07:31.510615629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9795460Z 2025-12-04T10:49:10.9795606Z [W1204 10:07:31.510886483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9795629Z 2025-12-04T10:49:10.9795777Z [W1204 10:07:31.510963251 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9795778Z 2025-12-04T10:49:10.9795828Z ('RERUN', {'yellow': True}) [10.5113s] [100%] 2025-12-04T10:49:10.9796188Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:32.409105919 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9796190Z 2025-12-04T10:49:10.9796336Z [W1204 10:07:32.409485661 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9796340Z 2025-12-04T10:49:10.9796486Z [W1204 10:07:32.409577039 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9796489Z 2025-12-04T10:49:10.9796637Z [W1204 10:07:32.410963719 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9796638Z 2025-12-04T10:49:10.9796785Z [W1204 10:07:32.411227983 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9796788Z 2025-12-04T10:49:10.9796935Z [W1204 10:07:32.411304822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9796937Z 2025-12-04T10:49:10.9797082Z [W1204 10:07:32.413511864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9797086Z 2025-12-04T10:49:10.9797232Z [W1204 10:07:32.413770329 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9797236Z 2025-12-04T10:49:10.9797382Z [W1204 10:07:32.413846677 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9797384Z 2025-12-04T10:49:10.9797432Z ('RERUN', {'yellow': True}) [0.7643s] [100%] 2025-12-04T10:49:10.9797811Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:07:33.178682389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9797813Z 2025-12-04T10:49:10.9797960Z [W1204 10:07:33.179080910 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9797962Z 2025-12-04T10:49:10.9798108Z [W1204 10:07:33.179181088 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9798111Z 2025-12-04T10:49:10.9798257Z [W1204 10:07:33.180594757 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9798259Z 2025-12-04T10:49:10.9798405Z [W1204 10:07:33.180864032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9798407Z 2025-12-04T10:49:10.9798555Z [W1204 10:07:33.180941360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9798557Z 2025-12-04T10:49:10.9798704Z [W1204 10:07:33.183155612 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9798706Z 2025-12-04T10:49:10.9798854Z [W1204 10:07:33.183414077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9798875Z 2025-12-04T10:49:10.9799021Z [W1204 10:07:33.183490345 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9799023Z 2025-12-04T10:49:10.9799060Z FAILED [0.7755s] [100%] 2025-12-04T10:49:10.9799062Z 2025-12-04T10:49:10.9799113Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9799265Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9799311Z Traceback (most recent call last): 2025-12-04T10:49:10.9799466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9799507Z method(*args, **kwargs) 2025-12-04T10:49:10.9799658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9799700Z method(*args, **kwargs) 2025-12-04T10:49:10.9799850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9799887Z with policy(): 2025-12-04T10:49:10.9800037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:10.9800079Z raise RuntimeError(msg) 2025-12-04T10:49:10.9800479Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9800482Z 2025-12-04T10:49:10.9800554Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9800843Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9800847Z 2025-12-04T10:49:10.9800932Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9801004Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9801080Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9801350Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9801421Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9801459Z graph_break [] 2025-12-04T10:49:10.9801529Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9801903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9801947Z if out == self.unknown_value: 2025-12-04T10:49:10.9802098Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9802143Z Traceback (most recent call last): 2025-12-04T10:49:10.9802295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9802335Z method(*args, **kwargs) 2025-12-04T10:49:10.9802486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9802526Z method(*args, **kwargs) 2025-12-04T10:49:10.9802704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9802741Z with policy(): 2025-12-04T10:49:10.9802891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9802932Z raise RuntimeError(msg) 2025-12-04T10:49:10.9803339Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9803341Z 2025-12-04T10:49:10.9803415Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9803704Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9803708Z 2025-12-04T10:49:10.9803793Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9803865Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9803919Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9804189Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9804260Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9804298Z graph_break [] 2025-12-04T10:49:10.9804367Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9804706Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9804750Z if out == self.unknown_value: 2025-12-04T10:49:10.9804821Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9804901Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9804974Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9805242Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9805278Z graph_break [] 2025-12-04T10:49:10.9805329Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9805483Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9805528Z Traceback (most recent call last): 2025-12-04T10:49:10.9805681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9805721Z method(*args, **kwargs) 2025-12-04T10:49:10.9805873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9805913Z method(*args, **kwargs) 2025-12-04T10:49:10.9806061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9806097Z with policy(): 2025-12-04T10:49:10.9806248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9806319Z raise RuntimeError(msg) 2025-12-04T10:49:10.9806725Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9806727Z 2025-12-04T10:49:10.9806801Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9807088Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9807091Z 2025-12-04T10:49:10.9807175Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9807246Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9807302Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9807571Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9807640Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9807678Z graph_break [] 2025-12-04T10:49:10.9807748Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9808090Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9808132Z if out == self.unknown_value: 2025-12-04T10:49:10.9808202Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9808257Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9808328Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9808617Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9808654Z graph_break [] 2025-12-04T10:49:10.9808723Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9808777Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9808845Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9809112Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9809150Z graph_break [] 2025-12-04T10:49:10.9809391Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fa322d5db22c17b8.xml - 2025-12-04T10:49:10.9809449Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9810085Z FAILED [0.7755s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9810112Z 2025-12-04T10:49:10.9810185Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9810473Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9810476Z 2025-12-04T10:49:10.9810561Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9810623Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9810689Z ================== 1 failed, 57 deselected, 2 rerun in 12.22s ================== 2025-12-04T10:49:10.9810726Z Got exit code 1 2025-12-04T10:49:10.9810964Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9811092Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9811288Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f7b80967d991d5cb.xml 2025-12-04T10:49:10.9811345Z ============================= test session starts ============================== 2025-12-04T10:49:10.9811455Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9811496Z cachedir: .pytest_cache 2025-12-04T10:49:10.9811653Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9811699Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:10.9811738Z configfile: pytest.ini 2025-12-04T10:49:10.9811940Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9812014Z collecting ... collected 58 items / 5 deselected / 53 selected 2025-12-04T10:49:10.9812067Z stepcurrent: skipping 5 already run items. 2025-12-04T10:49:10.9812110Z Running 53 items in this shard 2025-12-04T10:49:10.9812112Z 2025-12-04T10:49:10.9812756Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 SKIPPED [0.0006s] (Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/167814 for platform(s) rocm. If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests.) [ 1%] 2025-12-04T10:49:10.9813004Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0624s] [ 3%] 2025-12-04T10:49:10.9813246Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6342s] [ 3%] 2025-12-04T10:49:10.9813469Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.6240s] [ 3%] 2025-12-04T10:49:10.9813472Z 2025-12-04T10:49:10.9813523Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9813673Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9813718Z Traceback 
(most recent call last): 2025-12-04T10:49:10.9813874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9813914Z method(*args, **kwargs) 2025-12-04T10:49:10.9814065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9814132Z method(*args, **kwargs) 2025-12-04T10:49:10.9814281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9814319Z with policy(): 2025-12-04T10:49:10.9814470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9814510Z raise RuntimeError(msg) 2025-12-04T10:49:10.9814910Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 
2025-12-04T10:49:10.9814912Z 2025-12-04T10:49:10.9814986Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9815275Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9815278Z 2025-12-04T10:49:10.9815363Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9815437Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9815491Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9815762Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9815833Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9815872Z graph_break [] 2025-12-04T10:49:10.9816019Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9816065Z Traceback (most recent call last): 2025-12-04T10:49:10.9816218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9816257Z method(*args, **kwargs) 2025-12-04T10:49:10.9816430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9816470Z method(*args, **kwargs) 2025-12-04T10:49:10.9816618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9816661Z with policy(): 2025-12-04T10:49:10.9816815Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9816858Z raise RuntimeError(msg) 2025-12-04T10:49:10.9817258Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 2025-12-04T10:49:10.9817261Z 2025-12-04T10:49:10.9817334Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9817622Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9817624Z 2025-12-04T10:49:10.9817707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9817779Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9817856Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9818123Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9818194Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9818233Z graph_break [] 2025-12-04T10:49:10.9818308Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9818366Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9818435Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9818703Z inductor 
[('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9818742Z graph_break [] 2025-12-04T10:49:10.9818793Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9818942Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9818986Z Traceback (most recent call last): 2025-12-04T10:49:10.9819141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9819180Z method(*args, **kwargs) 2025-12-04T10:49:10.9819330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9819368Z method(*args, **kwargs) 2025-12-04T10:49:10.9819518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9819555Z with policy(): 2025-12-04T10:49:10.9819706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9819746Z raise RuntimeError(msg) 2025-12-04T10:49:10.9820170Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:10.9820172Z 2025-12-04T10:49:10.9820243Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9820528Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9820532Z 2025-12-04T10:49:10.9820617Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9820689Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9820745Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9821014Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9821085Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9821121Z graph_break [] 2025-12-04T10:49:10.9821193Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9821246Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9821317Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9821611Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9821648Z graph_break [] 2025-12-04T10:49:10.9821717Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9821773Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9821842Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9822139Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9822175Z graph_break [] 2025-12-04T10:49:10.9822416Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f7b80967d991d5cb.xml - 2025-12-04T10:49:10.9822477Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9823105Z FAILED [0.6240s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9823108Z 2025-12-04T10:49:10.9823180Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9823465Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9823470Z 2025-12-04T10:49:10.9823553Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9823614Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:10.9823686Z ============= 1 failed, 1 skipped, 5 deselected, 2 rerun in 4.49s ============== 2025-12-04T10:49:10.9823751Z Got exit code 1 2025-12-04T10:49:10.9823792Z Retrying single test... 2025-12-04T10:49:10.9823989Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5fd9c03999eb1759.xml 2025-12-04T10:49:10.9824045Z ============================= test session starts ============================== 2025-12-04T10:49:10.9824158Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9824198Z cachedir: .pytest_cache 2025-12-04T10:49:10.9824359Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9824403Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9824444Z configfile: pytest.ini 2025-12-04T10:49:10.9824605Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9824678Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9824961Z stepcurrent: skipping 6 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9825007Z Running 1 items in this shard 2025-12-04T10:49:10.9825009Z 2025-12-04T10:49:10.9825367Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:07:55.580848913 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9825400Z 2025-12-04T10:49:10.9825551Z [W1204 10:08:02.233272481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9825553Z 2025-12-04T10:49:10.9825704Z [W1204 10:08:02.233448517 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9825706Z 2025-12-04T10:49:10.9825853Z [W1204 10:08:02.237715036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9825855Z 2025-12-04T10:49:10.9826002Z [W1204 10:08:02.238035579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9826004Z 2025-12-04T10:49:10.9826150Z [W1204 10:08:02.238112808 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9826153Z 2025-12-04T10:49:10.9826301Z [W1204 10:08:02.240690083 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9826302Z 2025-12-04T10:49:10.9826450Z [W1204 10:08:02.240963097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9826452Z 2025-12-04T10:49:10.9826597Z [W1204 10:08:02.241046725 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9826599Z 2025-12-04T10:49:10.9826649Z ('RERUN', {'yellow': True}) [10.7122s] [100%] 2025-12-04T10:49:10.9827002Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:08:03.080866035 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827006Z 2025-12-04T10:49:10.9827156Z [W1204 10:08:03.081298906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827158Z 2025-12-04T10:49:10.9827336Z [W1204 10:08:03.081398944 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827338Z 2025-12-04T10:49:10.9827485Z [W1204 10:08:03.082814253 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827487Z 2025-12-04T10:49:10.9827635Z [W1204 10:08:03.083089387 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827637Z 2025-12-04T10:49:10.9827785Z [W1204 10:08:03.083175346 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827787Z 2025-12-04T10:49:10.9827933Z [W1204 10:08:03.085396128 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9827935Z 2025-12-04T10:49:10.9828085Z [W1204 10:08:03.085656763 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9828087Z 2025-12-04T10:49:10.9828232Z [W1204 10:08:03.085739421 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9828234Z 2025-12-04T10:49:10.9828283Z ('RERUN', {'yellow': True}) [0.7330s] [100%] 2025-12-04T10:49:10.9828635Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:08:04.814247148 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9828659Z 2025-12-04T10:49:10.9828808Z [W1204 10:08:04.814713198 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9828810Z 2025-12-04T10:49:10.9828959Z [W1204 10:08:04.814816556 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9828961Z 2025-12-04T10:49:10.9829106Z [W1204 10:08:04.816250755 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9829108Z 2025-12-04T10:49:10.9829254Z [W1204 10:08:04.816525680 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9829256Z 2025-12-04T10:49:10.9829403Z [W1204 10:08:04.816604878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9829407Z 2025-12-04T10:49:10.9829555Z [W1204 10:08:04.818868880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9829556Z 2025-12-04T10:49:10.9829705Z [W1204 10:08:04.819135744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9829708Z 2025-12-04T10:49:10.9829853Z [W1204 10:08:04.819215802 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9829855Z 2025-12-04T10:49:10.9829893Z FAILED [0.6955s] [100%] 2025-12-04T10:49:10.9829895Z 2025-12-04T10:49:10.9829945Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9830095Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9830141Z Traceback (most recent call last): 2025-12-04T10:49:10.9830299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9830338Z method(*args, **kwargs) 2025-12-04T10:49:10.9830512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9830552Z method(*args, **kwargs) 2025-12-04T10:49:10.9830702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9830738Z with policy(): 2025-12-04T10:49:10.9830890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9830930Z raise RuntimeError(msg) 2025-12-04T10:49:10.9831323Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:10.9831327Z 2025-12-04T10:49:10.9831400Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9831687Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9831689Z 2025-12-04T10:49:10.9831775Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9831893Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9831950Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9832251Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9832323Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9832359Z graph_break [] 2025-12-04T10:49:10.9832433Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9832776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9832820Z if out == self.unknown_value: 2025-12-04T10:49:10.9832969Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9833016Z Traceback (most recent call last): 2025-12-04T10:49:10.9833168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9833207Z method(*args, **kwargs) 2025-12-04T10:49:10.9833358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9833398Z method(*args, **kwargs) 2025-12-04T10:49:10.9833547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9833583Z with policy(): 2025-12-04T10:49:10.9833734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9833774Z raise RuntimeError(msg) 2025-12-04T10:49:10.9834178Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:10.9834182Z 2025-12-04T10:49:10.9834253Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9834575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9834578Z 2025-12-04T10:49:10.9834664Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9834735Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9834791Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9835059Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9835133Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9835170Z graph_break [] 2025-12-04T10:49:10.9835242Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9835582Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9835626Z if out == self.unknown_value: 2025-12-04T10:49:10.9835696Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9835751Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9835821Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9836112Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9836147Z graph_break [] 2025-12-04T10:49:10.9836202Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9836350Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9836397Z Traceback (most recent call last): 2025-12-04T10:49:10.9836550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9836588Z method(*args, **kwargs) 2025-12-04T10:49:10.9836739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9836778Z method(*args, **kwargs) 2025-12-04T10:49:10.9836928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9836964Z with policy(): 2025-12-04T10:49:10.9837115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9837157Z raise RuntimeError(msg) 2025-12-04T10:49:10.9837561Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9837564Z 2025-12-04T10:49:10.9837634Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9837922Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9837924Z 2025-12-04T10:49:10.9838010Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9838103Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9838159Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9838428Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9838499Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9838534Z graph_break [] 2025-12-04T10:49:10.9838608Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9838948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9838992Z if out == self.unknown_value: 2025-12-04T10:49:10.9839064Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9839118Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9839188Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9839458Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9839494Z graph_break [] 2025-12-04T10:49:10.9839586Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9839639Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9839709Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9839978Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9840013Z graph_break [] 2025-12-04T10:49:10.9840256Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5fd9c03999eb1759.xml - 2025-12-04T10:49:10.9840314Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9840947Z FAILED [0.6955s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9840952Z 2025-12-04T10:49:10.9841023Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9841313Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9841315Z 2025-12-04T10:49:10.9841400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9841461Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9841529Z ================== 1 failed, 57 deselected, 2 rerun in 12.31s ================== 2025-12-04T10:49:10.9841564Z Got exit code 1 2025-12-04T10:49:10.9841606Z Retrying single test... 2025-12-04T10:49:10.9841804Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-04887deaa8b42b54.xml 2025-12-04T10:49:10.9841924Z ============================= test session starts ============================== 2025-12-04T10:49:10.9842035Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9842075Z cachedir: .pytest_cache 2025-12-04T10:49:10.9842231Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9842277Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9842316Z configfile: pytest.ini 2025-12-04T10:49:10.9842478Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9842549Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9842832Z stepcurrent: skipping 6 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9842876Z Running 1 items in this shard 2025-12-04T10:49:10.9842878Z 2025-12-04T10:49:10.9843238Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:08:14.867810842 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9843241Z 2025-12-04T10:49:10.9843392Z [W1204 10:08:21.279511683 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9843421Z 2025-12-04T10:49:10.9843569Z [W1204 10:08:21.279664670 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9843571Z 2025-12-04T10:49:10.9843723Z [W1204 10:08:21.283517278 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9843724Z 2025-12-04T10:49:10.9843871Z [W1204 10:08:21.283814362 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9843874Z 2025-12-04T10:49:10.9844020Z [W1204 10:08:21.283889900 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9844022Z 2025-12-04T10:49:10.9844168Z [W1204 10:08:21.286383078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9844171Z 2025-12-04T10:49:10.9844316Z [W1204 10:08:21.286659512 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9844318Z 2025-12-04T10:49:10.9844465Z [W1204 10:08:21.286733450 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9844468Z 2025-12-04T10:49:10.9844517Z ('RERUN', {'yellow': True}) [10.4843s] [100%] 2025-12-04T10:49:10.9844870Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:08:22.092851313 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9844872Z 2025-12-04T10:49:10.9845021Z [W1204 10:08:22.093271765 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845024Z 2025-12-04T10:49:10.9845170Z [W1204 10:08:22.093362173 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845172Z 2025-12-04T10:49:10.9845322Z [W1204 10:08:22.094755543 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845346Z 2025-12-04T10:49:10.9845495Z [W1204 10:08:22.095015928 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845497Z 2025-12-04T10:49:10.9845643Z [W1204 10:08:22.095097906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9845645Z 2025-12-04T10:49:10.9845792Z [W1204 10:08:22.097281200 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845796Z 2025-12-04T10:49:10.9845941Z [W1204 10:08:22.097540074 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9845943Z 2025-12-04T10:49:10.9846090Z [W1204 10:08:22.097619822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9846093Z 2025-12-04T10:49:10.9846142Z ('RERUN', {'yellow': True}) [0.6563s] [100%] 2025-12-04T10:49:10.9846496Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:08:23.736259340 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9846498Z 2025-12-04T10:49:10.9846644Z [W1204 10:08:23.736695591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9846668Z 2025-12-04T10:49:10.9846815Z [W1204 10:08:23.736787739 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9846817Z 2025-12-04T10:49:10.9846963Z [W1204 10:08:23.738182769 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9846966Z 2025-12-04T10:49:10.9847113Z [W1204 10:08:23.738434364 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9847114Z 2025-12-04T10:49:10.9847264Z [W1204 10:08:23.738510702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9847265Z 2025-12-04T10:49:10.9847413Z [W1204 10:08:23.740689626 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9847418Z 2025-12-04T10:49:10.9847564Z [W1204 10:08:23.740949591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9847566Z 2025-12-04T10:49:10.9847714Z [W1204 10:08:23.741030529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9847715Z 2025-12-04T10:49:10.9847755Z FAILED [0.6520s] [100%] 2025-12-04T10:49:10.9847757Z 2025-12-04T10:49:10.9847810Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9847958Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9848003Z Traceback (most recent call last): 2025-12-04T10:49:10.9848159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9848203Z method(*args, **kwargs) 2025-12-04T10:49:10.9848353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9848393Z method(*args, **kwargs) 2025-12-04T10:49:10.9848542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9848580Z with policy(): 2025-12-04T10:49:10.9848763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:10.9848805Z raise RuntimeError(msg) 2025-12-04T10:49:10.9849198Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:10.9849202Z 2025-12-04T10:49:10.9849274Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9849561Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9849564Z 2025-12-04T10:49:10.9849650Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9849722Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9849776Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9850045Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9850147Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9850185Z graph_break [] 2025-12-04T10:49:10.9850255Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9850601Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9850645Z if out == self.unknown_value: 2025-12-04T10:49:10.9850792Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9850838Z Traceback (most recent call last): 2025-12-04T10:49:10.9850989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9851029Z method(*args, **kwargs) 2025-12-04T10:49:10.9851179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9851219Z method(*args, **kwargs) 2025-12-04T10:49:10.9851367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9851404Z with policy(): 2025-12-04T10:49:10.9851555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9851596Z raise RuntimeError(msg) 2025-12-04T10:49:10.9852030Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:10.9852033Z 2025-12-04T10:49:10.9852108Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9852393Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9852395Z 2025-12-04T10:49:10.9852480Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9852582Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9852637Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9852904Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9852975Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9853014Z graph_break [] 2025-12-04T10:49:10.9853084Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9853424Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9853468Z if out == self.unknown_value: 2025-12-04T10:49:10.9853539Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9853592Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9853662Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9853928Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9853992Z graph_break [] 2025-12-04T10:49:10.9854044Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9854193Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9854239Z Traceback (most recent call last): 2025-12-04T10:49:10.9854394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9854436Z method(*args, **kwargs) 2025-12-04T10:49:10.9854585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9854626Z method(*args, **kwargs) 2025-12-04T10:49:10.9854776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9854817Z with policy(): 2025-12-04T10:49:10.9854968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9855011Z raise RuntimeError(msg) 2025-12-04T10:49:10.9855414Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9855417Z 2025-12-04T10:49:10.9855491Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9855776Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9855780Z 2025-12-04T10:49:10.9855865Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9855938Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9855993Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9856287Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9856359Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9856399Z graph_break [] 2025-12-04T10:49:10.9856471Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9856813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9856858Z if out == self.unknown_value: 2025-12-04T10:49:10.9856932Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9856987Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9857059Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9857329Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9857369Z graph_break [] 2025-12-04T10:49:10.9857438Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9857496Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9857568Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9857855Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9857892Z graph_break [] 2025-12-04T10:49:10.9858136Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-04887deaa8b42b54.xml - 2025-12-04T10:49:10.9858196Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9858823Z FAILED [0.6520s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9858827Z 2025-12-04T10:49:10.9858899Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9859187Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9859189Z 2025-12-04T10:49:10.9859273Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9859337Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9859402Z ================== 1 failed, 57 deselected, 2 rerun in 11.96s ================== 2025-12-04T10:49:10.9859440Z Got exit code 1 2025-12-04T10:49:10.9859680Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9859810Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9860005Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b5bdc697707e9f87.xml 2025-12-04T10:49:10.9860089Z ============================= test session starts ============================== 2025-12-04T10:49:10.9860201Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9860245Z cachedir: .pytest_cache 2025-12-04T10:49:10.9860403Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9860451Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:10.9860490Z configfile: pytest.ini 2025-12-04T10:49:10.9860656Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9860728Z collecting ... collected 58 items / 7 deselected / 51 selected 2025-12-04T10:49:10.9860783Z stepcurrent: skipping 7 already run items. 2025-12-04T10:49:10.9860825Z Running 51 items in this shard 2025-12-04T10:49:10.9860828Z 2025-12-04T10:49:10.9861079Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6399s] [ 1%] 2025-12-04T10:49:10.9861325Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4927s] [ 1%] 2025-12-04T10:49:10.9861546Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.4810s] [ 1%] 2025-12-04T10:49:10.9861574Z 2025-12-04T10:49:10.9861627Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9861777Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9861826Z Traceback (most recent call last): 2025-12-04T10:49:10.9862030Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9862074Z method(*args, **kwargs) 2025-12-04T10:49:10.9862225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9862268Z method(*args, **kwargs) 2025-12-04T10:49:10.9862416Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9862457Z with policy(): 2025-12-04T10:49:10.9862614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9862654Z raise RuntimeError(msg) 2025-12-04T10:49:10.9863056Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9863058Z 2025-12-04T10:49:10.9863130Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9863422Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9863426Z 2025-12-04T10:49:10.9863511Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9863585Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9863639Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9863936Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9864009Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9864046Z graph_break [] 2025-12-04T10:49:10.9864195Z _ 
TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9864240Z Traceback (most recent call last): 2025-12-04T10:49:10.9864396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9864436Z method(*args, **kwargs) 2025-12-04T10:49:10.9864589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9864628Z method(*args, **kwargs) 2025-12-04T10:49:10.9864782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9864819Z with policy(): 2025-12-04T10:49:10.9864970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9865010Z raise RuntimeError(msg) 2025-12-04T10:49:10.9865420Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9865449Z 2025-12-04T10:49:10.9865520Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9865809Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9865812Z 2025-12-04T10:49:10.9865899Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9865968Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9866025Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9866293Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9866442Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9866478Z graph_break [] 2025-12-04T10:49:10.9866550Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9866604Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9866677Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9866945Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9866985Z graph_break [] 2025-12-04T10:49:10.9867037Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9867189Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 
2025-12-04T10:49:10.9867235Z Traceback (most recent call last): 2025-12-04T10:49:10.9867389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9867428Z method(*args, **kwargs) 2025-12-04T10:49:10.9867600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9867640Z method(*args, **kwargs) 2025-12-04T10:49:10.9867792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9867831Z with policy(): 2025-12-04T10:49:10.9867981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9868024Z raise RuntimeError(msg) 2025-12-04T10:49:10.9868428Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9868432Z 2025-12-04T10:49:10.9868505Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9868793Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9868795Z 2025-12-04T10:49:10.9868881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9868951Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9869006Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9869305Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9869375Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9869412Z graph_break [] 2025-12-04T10:49:10.9869483Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9869538Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9869610Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9869880Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9869915Z graph_break [] 2025-12-04T10:49:10.9869988Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9870042Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9870113Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9870380Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9870419Z graph_break [] 2025-12-04T10:49:10.9870663Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b5bdc697707e9f87.xml - 2025-12-04T10:49:10.9870723Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9871357Z FAILED [0.4810s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9871361Z 2025-12-04T10:49:10.9871453Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9871739Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9871742Z 2025-12-04T10:49:10.9871824Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9871928Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:10.9871996Z =================== 1 failed, 7 deselected, 2 rerun in 3.78s =================== 2025-12-04T10:49:10.9872036Z Got exit code 1 2025-12-04T10:49:10.9872076Z Retrying single test... 2025-12-04T10:49:10.9872271Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6094628dd2bd9a1d.xml 2025-12-04T10:49:10.9872328Z ============================= test session starts ============================== 2025-12-04T10:49:10.9872438Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9872479Z cachedir: .pytest_cache 2025-12-04T10:49:10.9872635Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9872680Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9872720Z configfile: pytest.ini 2025-12-04T10:49:10.9872914Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9872987Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9873274Z stepcurrent: skipping 7 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9873316Z Running 1 items in this shard 2025-12-04T10:49:10.9873318Z 2025-12-04T10:49:10.9873682Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:08:44.592593432 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9873684Z 2025-12-04T10:49:10.9873835Z [W1204 10:08:51.265034955 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9873840Z 2025-12-04T10:49:10.9873990Z [W1204 10:08:51.265215761 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9873992Z 2025-12-04T10:49:10.9874140Z [W1204 10:08:51.269063390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9874142Z 2025-12-04T10:49:10.9874288Z [W1204 10:08:51.269351644 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9874290Z 2025-12-04T10:49:10.9874438Z [W1204 10:08:51.269428213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9874440Z 2025-12-04T10:49:10.9874586Z [W1204 10:08:51.271926840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9874590Z 2025-12-04T10:49:10.9874737Z [W1204 10:08:51.272205074 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9874739Z 2025-12-04T10:49:10.9874886Z [W1204 10:08:51.272282923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9874914Z 2025-12-04T10:49:10.9874966Z ('RERUN', {'yellow': True}) [11.1805s] [100%] 2025-12-04T10:49:10.9875325Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:08:52.084348523 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9875327Z 2025-12-04T10:49:10.9875474Z [W1204 10:08:52.084766215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9875477Z 2025-12-04T10:49:10.9875625Z [W1204 10:08:52.084853003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9875627Z 2025-12-04T10:49:10.9875777Z [W1204 10:08:52.086228204 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9875780Z 2025-12-04T10:49:10.9875927Z [W1204 10:08:52.086479178 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9875929Z 2025-12-04T10:49:10.9876076Z [W1204 10:08:52.086553387 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9876078Z 2025-12-04T10:49:10.9876225Z [W1204 10:08:52.088722471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9876254Z 2025-12-04T10:49:10.9876402Z [W1204 10:08:52.088983616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9876404Z 2025-12-04T10:49:10.9876551Z [W1204 10:08:52.089063774 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9876553Z 2025-12-04T10:49:10.9876602Z ('RERUN', {'yellow': True}) [0.6725s] [100%] 2025-12-04T10:49:10.9876958Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:08:53.729318460 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9876960Z 2025-12-04T10:49:10.9877107Z [W1204 10:08:53.729742912 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9877111Z 2025-12-04T10:49:10.9877259Z [W1204 10:08:53.729831790 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9877261Z 2025-12-04T10:49:10.9877407Z [W1204 10:08:53.731231790 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9877411Z 2025-12-04T10:49:10.9877557Z [W1204 10:08:53.731490115 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9877559Z 2025-12-04T10:49:10.9877707Z [W1204 10:08:53.731567593 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9877709Z 2025-12-04T10:49:10.9877855Z [W1204 10:08:53.733754687 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9877859Z 2025-12-04T10:49:10.9878008Z [W1204 10:08:53.734018012 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9878010Z 2025-12-04T10:49:10.9878155Z [W1204 10:08:53.734094990 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9878157Z 2025-12-04T10:49:10.9878217Z FAILED [0.6323s] [100%] 2025-12-04T10:49:10.9878219Z 2025-12-04T10:49:10.9878272Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9878421Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9878467Z Traceback (most recent call last): 2025-12-04T10:49:10.9878622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9878665Z method(*args, **kwargs) 2025-12-04T10:49:10.9878815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9878855Z method(*args, **kwargs) 2025-12-04T10:49:10.9879004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9879043Z with policy(): 2025-12-04T10:49:10.9879195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9879237Z raise RuntimeError(msg) 2025-12-04T10:49:10.9879632Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9879654Z 2025-12-04T10:49:10.9879728Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9880017Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9880020Z 2025-12-04T10:49:10.9880108Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9880181Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9880235Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9880507Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9880578Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9880617Z graph_break [] 2025-12-04T10:49:10.9880687Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9881032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9881075Z if out == self.unknown_value: 2025-12-04T10:49:10.9881227Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9881271Z Traceback (most recent call last): 2025-12-04T10:49:10.9881424Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9881463Z method(*args, **kwargs) 2025-12-04T10:49:10.9881617Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9881656Z method(*args, **kwargs) 2025-12-04T10:49:10.9881805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9881842Z with policy(): 2025-12-04T10:49:10.9882069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9882111Z raise RuntimeError(msg) 2025-12-04T10:49:10.9882517Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9882519Z 2025-12-04T10:49:10.9882594Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9882882Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9882884Z 2025-12-04T10:49:10.9882971Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9883043Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9883101Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9883370Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9883442Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9883509Z graph_break [] 2025-12-04T10:49:10.9883579Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9883918Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9883962Z if out == self.unknown_value: 2025-12-04T10:49:10.9884033Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9884087Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9884158Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9884425Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9884464Z graph_break [] 2025-12-04T10:49:10.9884515Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9884667Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9884711Z Traceback (most recent call last): 2025-12-04T10:49:10.9884868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9884907Z method(*args, **kwargs) 2025-12-04T10:49:10.9885057Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9885096Z method(*args, **kwargs) 2025-12-04T10:49:10.9885247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9885282Z with policy(): 2025-12-04T10:49:10.9885435Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9885476Z raise RuntimeError(msg) 2025-12-04T10:49:10.9885899Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9885902Z 2025-12-04T10:49:10.9885974Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9886259Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9886261Z 2025-12-04T10:49:10.9886349Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9886419Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9886475Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9886745Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9886816Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9886853Z graph_break [] 2025-12-04T10:49:10.9886923Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9887263Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9887329Z if out == self.unknown_value: 2025-12-04T10:49:10.9887400Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9887453Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9887525Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9887792Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9887829Z graph_break [] 2025-12-04T10:49:10.9887898Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9887952Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9888021Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9888289Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9888325Z graph_break [] 2025-12-04T10:49:10.9888566Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6094628dd2bd9a1d.xml - 2025-12-04T10:49:10.9888625Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9889261Z FAILED [0.6323s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9889265Z 2025-12-04T10:49:10.9889337Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9889652Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9889655Z 2025-12-04T10:49:10.9889740Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9889800Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9889867Z ================== 1 failed, 57 deselected, 2 rerun in 12.65s ================== 2025-12-04T10:49:10.9889903Z Got exit code 1 2025-12-04T10:49:10.9889944Z Retrying single test... 2025-12-04T10:49:10.9890143Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1e1dcabaadcea7ab.xml 2025-12-04T10:49:10.9890200Z ============================= test session starts ============================== 2025-12-04T10:49:10.9890311Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9890353Z cachedir: .pytest_cache 2025-12-04T10:49:10.9890512Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9890556Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9890597Z configfile: pytest.ini 2025-12-04T10:49:10.9890757Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9890830Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9891114Z stepcurrent: skipping 7 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9891184Z Running 1 items in this shard 2025-12-04T10:49:10.9891185Z 2025-12-04T10:49:10.9891547Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:09:03.601160217 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9891549Z 2025-12-04T10:49:10.9891703Z [W1204 10:09:10.193212988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9891705Z 2025-12-04T10:49:10.9891882Z [W1204 10:09:10.193371644 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9891886Z 2025-12-04T10:49:10.9892034Z [W1204 10:09:10.196982829 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9892036Z 2025-12-04T10:49:10.9892184Z [W1204 10:09:10.197284163 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9892185Z 2025-12-04T10:49:10.9892333Z [W1204 10:09:10.197361961 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9892335Z 2025-12-04T10:49:10.9892482Z [W1204 10:09:10.199784520 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9892484Z 2025-12-04T10:49:10.9892631Z [W1204 10:09:10.200054334 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9892635Z 2025-12-04T10:49:10.9892781Z [W1204 10:09:10.200131823 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9892783Z 2025-12-04T10:49:10.9892834Z ('RERUN', {'yellow': True}) [10.2545s] [100%] 2025-12-04T10:49:10.9893217Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:09:11.814072335 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9893220Z 2025-12-04T10:49:10.9893369Z [W1204 10:09:11.814503866 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9893371Z 2025-12-04T10:49:10.9893518Z [W1204 10:09:11.814591024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9893522Z 2025-12-04T10:49:10.9893667Z [W1204 10:09:11.815967945 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9893669Z 2025-12-04T10:49:10.9893818Z [W1204 10:09:11.816234140 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9893820Z 2025-12-04T10:49:10.9893968Z [W1204 10:09:11.816314158 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9893970Z 2025-12-04T10:49:10.9894118Z [W1204 10:09:11.818480213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9894120Z 2025-12-04T10:49:10.9894265Z [W1204 10:09:11.818737717 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9894269Z 2025-12-04T10:49:10.9894442Z [W1204 10:09:11.818811876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9894444Z 2025-12-04T10:49:10.9894493Z ('RERUN', {'yellow': True}) [0.4765s] [100%] 2025-12-04T10:49:10.9894849Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:09:11.292271869 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9894851Z 2025-12-04T10:49:10.9895000Z [W1204 10:09:11.292669891 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895001Z 2025-12-04T10:49:10.9895149Z [W1204 10:09:11.292765389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895156Z 2025-12-04T10:49:10.9895303Z [W1204 10:09:11.294154100 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895304Z 2025-12-04T10:49:10.9895453Z [W1204 10:09:11.294416194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9895455Z 2025-12-04T10:49:10.9895603Z [W1204 10:09:11.294490013 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895604Z 2025-12-04T10:49:10.9895751Z [W1204 10:09:11.296646218 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895753Z 2025-12-04T10:49:10.9895899Z [W1204 10:09:11.296900492 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9895901Z 2025-12-04T10:49:10.9896051Z [W1204 10:09:11.296976311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9896052Z 2025-12-04T10:49:10.9896091Z FAILED [0.4770s] [100%] 2025-12-04T10:49:10.9896093Z 2025-12-04T10:49:10.9896145Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9896318Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9896364Z Traceback (most recent call last): 2025-12-04T10:49:10.9896521Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9896560Z method(*args, **kwargs) 2025-12-04T10:49:10.9896712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9896751Z method(*args, **kwargs) 2025-12-04T10:49:10.9896904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9896941Z with policy(): 2025-12-04T10:49:10.9897093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:10.9897132Z raise RuntimeError(msg) 2025-12-04T10:49:10.9897531Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9897534Z 2025-12-04T10:49:10.9897607Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9897896Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9897921Z 2025-12-04T10:49:10.9898008Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9898079Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9898136Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9898406Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9898478Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9898514Z graph_break [] 2025-12-04T10:49:10.9898587Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9898927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9898973Z if out == self.unknown_value: 2025-12-04T10:49:10.9899121Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9899169Z Traceback (most recent call last): 2025-12-04T10:49:10.9899320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9899361Z method(*args, **kwargs) 2025-12-04T10:49:10.9899512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9899550Z method(*args, **kwargs) 2025-12-04T10:49:10.9899701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9899738Z with policy(): 2025-12-04T10:49:10.9899891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9899931Z raise RuntimeError(msg) 2025-12-04T10:49:10.9900363Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9900365Z 2025-12-04T10:49:10.9900438Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9900724Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9900728Z 2025-12-04T10:49:10.9900813Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9900885Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9900940Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9901208Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9901280Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9901316Z graph_break [] 2025-12-04T10:49:10.9901388Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9901727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9901796Z if out == self.unknown_value: 2025-12-04T10:49:10.9901909Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9901964Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9902036Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9902305Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9902341Z graph_break [] 2025-12-04T10:49:10.9902394Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9902543Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:10.9902594Z Traceback (most recent call last): 2025-12-04T10:49:10.9902747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9902788Z method(*args, **kwargs) 2025-12-04T10:49:10.9902939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9902978Z method(*args, **kwargs) 2025-12-04T10:49:10.9903128Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9903165Z with policy(): 2025-12-04T10:49:10.9903317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9903357Z raise RuntimeError(msg) 2025-12-04T10:49:10.9903764Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9903768Z 2025-12-04T10:49:10.9903839Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9904163Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9904165Z 2025-12-04T10:49:10.9904250Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9904322Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9904376Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9904646Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9904717Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9904753Z graph_break [] 2025-12-04T10:49:10.9904828Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9905166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9905210Z if out == self.unknown_value: 2025-12-04T10:49:10.9905280Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9905334Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9905434Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9905701Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9905737Z graph_break [] 2025-12-04T10:49:10.9905810Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9905863Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9905934Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9906199Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9906238Z graph_break [] 2025-12-04T10:49:10.9906484Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1e1dcabaadcea7ab.xml - 2025-12-04T10:49:10.9906542Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9907178Z FAILED [0.4770s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9907180Z 2025-12-04T10:49:10.9907251Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9907539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9907541Z 2025-12-04T10:49:10.9907625Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9907704Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9907771Z ================== 1 failed, 57 deselected, 2 rerun in 11.38s ================== 2025-12-04T10:49:10.9907807Z Got exit code 1 2025-12-04T10:49:10.9908045Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:10.9908171Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9908371Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5657165bc5ba6c7b.xml 2025-12-04T10:49:10.9908428Z ============================= test session starts ============================== 2025-12-04T10:49:10.9908540Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9908581Z cachedir: .pytest_cache 2025-12-04T10:49:10.9908743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9908788Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:10.9908830Z configfile: pytest.ini 2025-12-04T10:49:10.9908992Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9909065Z collecting ... collected 58 items / 8 deselected / 50 selected 2025-12-04T10:49:10.9909139Z stepcurrent: skipping 8 already run items. 2025-12-04T10:49:10.9909184Z Running 50 items in this shard 2025-12-04T10:49:10.9909186Z 2025-12-04T10:49:10.9909432Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7161s] [ 2%] 2025-12-04T10:49:10.9909675Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6234s] [ 2%] 2025-12-04T10:49:10.9909896Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.6193s] [ 2%] 2025-12-04T10:49:10.9909899Z 2025-12-04T10:49:10.9909950Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9910101Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9910147Z Traceback (most recent call last): 2025-12-04T10:49:10.9910303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9910343Z method(*args, **kwargs) 2025-12-04T10:49:10.9910496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9910535Z method(*args, **kwargs) 2025-12-04T10:49:10.9910688Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9910726Z with policy(): 2025-12-04T10:49:10.9910880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9910921Z raise RuntimeError(msg) 2025-12-04T10:49:10.9911314Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9911317Z 2025-12-04T10:49:10.9911391Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9911695Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9911697Z 2025-12-04T10:49:10.9911785Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9911884Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9911940Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9912211Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9912284Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9912320Z graph_break [] 2025-12-04T10:49:10.9912471Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 
_ 2025-12-04T10:49:10.9912515Z Traceback (most recent call last): 2025-12-04T10:49:10.9912669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9912707Z method(*args, **kwargs) 2025-12-04T10:49:10.9912858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9912932Z method(*args, **kwargs) 2025-12-04T10:49:10.9913082Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9913119Z with policy(): 2025-12-04T10:49:10.9913270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9913311Z raise RuntimeError(msg) 2025-12-04T10:49:10.9913711Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9913714Z 2025-12-04T10:49:10.9913786Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9914071Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9914075Z 2025-12-04T10:49:10.9914161Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9914232Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9914289Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9914560Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9914631Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9914668Z graph_break [] 2025-12-04T10:49:10.9914738Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9914794Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9914863Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9915131Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9915192Z graph_break [] 2025-12-04T10:49:10.9915247Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9915395Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 
2025-12-04T10:49:10.9915441Z Traceback (most recent call last): 2025-12-04T10:49:10.9915592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9915634Z method(*args, **kwargs) 2025-12-04T10:49:10.9915783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9915823Z method(*args, **kwargs) 2025-12-04T10:49:10.9915972Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9916009Z with policy(): 2025-12-04T10:49:10.9916161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9916203Z raise RuntimeError(msg) 2025-12-04T10:49:10.9916598Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:10.9916631Z 2025-12-04T10:49:10.9916702Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9916987Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9916989Z 2025-12-04T10:49:10.9917074Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9917146Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9917200Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9917471Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9917541Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9917582Z graph_break [] 2025-12-04T10:49:10.9917653Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9917707Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9917776Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9918045Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9918129Z graph_break [] 2025-12-04T10:49:10.9918211Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9924417Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9924519Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9924821Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9924860Z graph_break [] 2025-12-04T10:49:10.9925182Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5657165bc5ba6c7b.xml - 2025-12-04T10:49:10.9925251Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9925885Z FAILED [0.6193s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9925890Z 2025-12-04T10:49:10.9925968Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9926263Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9926266Z 2025-12-04T10:49:10.9926354Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9926422Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:10.9926490Z =================== 1 failed, 8 deselected, 2 rerun in 4.13s =================== 2025-12-04T10:49:10.9926530Z Got exit code 1 2025-12-04T10:49:10.9926571Z Retrying single test... 2025-12-04T10:49:10.9926770Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2b813152e0f62b23.xml 2025-12-04T10:49:10.9926868Z ============================= test session starts ============================== 2025-12-04T10:49:10.9926986Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9927027Z cachedir: .pytest_cache 2025-12-04T10:49:10.9927191Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9927238Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9927280Z configfile: pytest.ini 2025-12-04T10:49:10.9927445Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9927523Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9927807Z stepcurrent: skipping 8 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9927857Z Running 1 items in this shard 2025-12-04T10:49:10.9927859Z 2025-12-04T10:49:10.9928229Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:32.658246578 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9928232Z 2025-12-04T10:49:10.9928385Z [W1204 10:09:39.023181706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9928387Z 2025-12-04T10:49:10.9928542Z [W1204 10:09:39.023336993 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9928544Z 2025-12-04T10:49:10.9928694Z [W1204 10:09:39.027340160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9928696Z 2025-12-04T10:49:10.9928847Z [W1204 10:09:39.027647504 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9928849Z 2025-12-04T10:49:10.9929020Z [W1204 10:09:39.027744762 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9929022Z 2025-12-04T10:49:10.9929169Z [W1204 10:09:39.030225480 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9929171Z 2025-12-04T10:49:10.9929319Z [W1204 10:09:39.030491585 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9929321Z 2025-12-04T10:49:10.9929467Z [W1204 10:09:39.030566173 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9929474Z 2025-12-04T10:49:10.9929527Z ('RERUN', {'yellow': True}) [10.0489s] [100%] 2025-12-04T10:49:10.9929887Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:40.770745468 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9929893Z 2025-12-04T10:49:10.9930049Z [W1204 10:09:40.771147629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930051Z 2025-12-04T10:49:10.9930200Z [W1204 10:09:40.771237168 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930202Z 2025-12-04T10:49:10.9930379Z [W1204 10:09:40.772717477 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930381Z 2025-12-04T10:49:10.9930530Z [W1204 10:09:40.773033760 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930532Z 2025-12-04T10:49:10.9930683Z [W1204 10:09:40.773112389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930687Z 2025-12-04T10:49:10.9930834Z [W1204 10:09:40.775396171 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9930836Z 2025-12-04T10:49:10.9930985Z [W1204 10:09:40.775666176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9930987Z 2025-12-04T10:49:10.9931135Z [W1204 10:09:40.775740444 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9931138Z 2025-12-04T10:49:10.9931191Z ('RERUN', {'yellow': True}) [0.6025s] [100%] 2025-12-04T10:49:10.9931545Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:40.358182587 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9931547Z 2025-12-04T10:49:10.9931700Z [W1204 10:09:40.358551030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9931702Z 2025-12-04T10:49:10.9931906Z [W1204 10:09:40.358634988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9931908Z 2025-12-04T10:49:10.9932060Z [W1204 10:09:40.360023389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9932064Z 2025-12-04T10:49:10.9932216Z [W1204 10:09:40.360273054 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9932219Z 2025-12-04T10:49:10.9932394Z [W1204 10:09:40.360348202 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9932396Z 2025-12-04T10:49:10.9932546Z [W1204 10:09:40.362560476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9932548Z 2025-12-04T10:49:10.9932697Z [W1204 10:09:40.362812601 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9932699Z 2025-12-04T10:49:10.9932846Z [W1204 10:09:40.362885890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9932850Z 2025-12-04T10:49:10.9932891Z FAILED [0.5849s] [100%] 2025-12-04T10:49:10.9932893Z 2025-12-04T10:49:10.9932947Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9933103Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9933152Z Traceback (most recent call last): 2025-12-04T10:49:10.9933316Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9933357Z method(*args, **kwargs) 2025-12-04T10:49:10.9933511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9933551Z method(*args, **kwargs) 2025-12-04T10:49:10.9933705Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9933767Z with policy(): 2025-12-04T10:49:10.9933922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9933963Z raise RuntimeError(msg) 2025-12-04T10:49:10.9934362Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9934364Z 2025-12-04T10:49:10.9934441Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9934732Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9934737Z 2025-12-04T10:49:10.9934826Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9934902Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9934962Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9935237Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9935313Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9935350Z graph_break [] 2025-12-04T10:49:10.9935425Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9935773Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9935820Z if out == self.unknown_value: 2025-12-04T10:49:10.9935973Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9936018Z Traceback (most recent call last): 2025-12-04T10:49:10.9936197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9936237Z method(*args, **kwargs) 2025-12-04T10:49:10.9936389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9936428Z method(*args, **kwargs) 2025-12-04T10:49:10.9936578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9936616Z with policy(): 2025-12-04T10:49:10.9936768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9936808Z raise RuntimeError(msg) 2025-12-04T10:49:10.9937212Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9937215Z 2025-12-04T10:49:10.9937290Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9958167Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9958215Z 2025-12-04T10:49:10.9958302Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9958375Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9958434Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9958706Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9958779Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9958814Z graph_break [] 2025-12-04T10:49:10.9958885Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9959227Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9959274Z if out == self.unknown_value: 2025-12-04T10:49:10.9959344Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9959399Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9959468Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9959740Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9959776Z graph_break [] 2025-12-04T10:49:10.9959827Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9959980Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9960027Z Traceback (most recent call last): 2025-12-04T10:49:10.9960182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9960222Z method(*args, **kwargs) 2025-12-04T10:49:10.9960374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9960440Z method(*args, **kwargs) 2025-12-04T10:49:10.9960590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9960626Z with policy(): 2025-12-04T10:49:10.9960778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9960817Z raise RuntimeError(msg) 2025-12-04T10:49:10.9961217Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9961221Z 2025-12-04T10:49:10.9961293Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9961584Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9961586Z 2025-12-04T10:49:10.9961672Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9961743Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9961798Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9962108Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9962213Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9962249Z graph_break [] 2025-12-04T10:49:10.9962320Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9962660Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9962703Z if out == self.unknown_value: 2025-12-04T10:49:10.9962772Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9962826Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9962897Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9963163Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9963199Z graph_break [] 2025-12-04T10:49:10.9963269Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9963322Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9963391Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9963657Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9963691Z graph_break [] 2025-12-04T10:49:10.9963935Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2b813152e0f62b23.xml - 2025-12-04T10:49:10.9963993Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9964649Z FAILED [0.5849s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9964652Z 2025-12-04T10:49:10.9964724Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9965010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9965014Z 2025-12-04T10:49:10.9965099Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9965158Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9965226Z ================== 1 failed, 57 deselected, 2 rerun in 11.39s ================== 2025-12-04T10:49:10.9965261Z Got exit code 1 2025-12-04T10:49:10.9965302Z Retrying single test... 2025-12-04T10:49:10.9965497Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d8b0179133188cec.xml 2025-12-04T10:49:10.9965554Z ============================= test session starts ============================== 2025-12-04T10:49:10.9965667Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9965728Z cachedir: .pytest_cache 2025-12-04T10:49:10.9965887Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9965934Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9965972Z configfile: pytest.ini 2025-12-04T10:49:10.9966137Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9966211Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9966496Z stepcurrent: skipping 8 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9966540Z Running 1 items in this shard 2025-12-04T10:49:10.9966543Z 2025-12-04T10:49:10.9966901Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:50.054681412 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9966905Z 2025-12-04T10:49:10.9967059Z [W1204 10:09:57.385480534 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967061Z 2025-12-04T10:49:10.9967211Z [W1204 10:09:57.385649500 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967213Z 2025-12-04T10:49:10.9967360Z [W1204 10:09:57.389449462 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967362Z 2025-12-04T10:49:10.9967509Z [W1204 10:09:57.389745036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967513Z 2025-12-04T10:49:10.9967659Z [W1204 10:09:57.389820134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967661Z 2025-12-04T10:49:10.9967807Z [W1204 10:09:57.392288983 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9967832Z 2025-12-04T10:49:10.9967979Z [W1204 10:09:57.392559947 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9967981Z 2025-12-04T10:49:10.9968127Z [W1204 10:09:57.392633376 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9968129Z 2025-12-04T10:49:10.9968179Z ('RERUN', {'yellow': True}) [10.0532s] [100%] 2025-12-04T10:49:10.9968533Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:58.135684381 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9968539Z 2025-12-04T10:49:10.9968687Z [W1204 10:09:58.136046934 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9968691Z 2025-12-04T10:49:10.9968837Z [W1204 10:09:58.136130772 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9968838Z 2025-12-04T10:49:10.9968985Z [W1204 10:09:58.137481764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9968987Z 2025-12-04T10:49:10.9969135Z [W1204 10:09:58.137732859 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9969157Z 2025-12-04T10:49:10.9969303Z [W1204 10:09:58.137806497 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9969305Z 2025-12-04T10:49:10.9969451Z [W1204 10:09:58.139970283 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9969453Z 2025-12-04T10:49:10.9969600Z [W1204 10:09:58.140228617 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9969602Z 2025-12-04T10:49:10.9969749Z [W1204 10:09:58.140305306 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9969750Z 2025-12-04T10:49:10.9969798Z ('RERUN', {'yellow': True}) [0.5996s] [100%] 2025-12-04T10:49:10.9970154Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:09:59.731380735 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9970157Z 2025-12-04T10:49:10.9970306Z [W1204 10:09:59.731752447 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9970308Z 2025-12-04T10:49:10.9970456Z [W1204 10:09:59.731839926 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9970457Z 2025-12-04T10:49:10.9970605Z [W1204 10:09:59.733211787 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9970607Z 2025-12-04T10:49:10.9970754Z [W1204 10:09:59.733462842 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9970758Z 2025-12-04T10:49:10.9970905Z [W1204 10:09:59.733537570 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9970907Z 2025-12-04T10:49:10.9971056Z [W1204 10:09:59.735676486 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9971058Z 2025-12-04T10:49:10.9971230Z [W1204 10:09:59.735929891 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9971232Z 2025-12-04T10:49:10.9971380Z [W1204 10:09:59.736008239 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9971381Z 2025-12-04T10:49:10.9971419Z FAILED [0.5993s] [100%] 2025-12-04T10:49:10.9971421Z 2025-12-04T10:49:10.9971475Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9971627Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9971674Z Traceback (most recent call last): 2025-12-04T10:49:10.9971831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9971911Z method(*args, **kwargs) 2025-12-04T10:49:10.9972063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9972104Z method(*args, **kwargs) 2025-12-04T10:49:10.9972256Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9972293Z with policy(): 2025-12-04T10:49:10.9972446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:10.9972516Z raise RuntimeError(msg) 2025-12-04T10:49:10.9972912Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:10.9972915Z 2025-12-04T10:49:10.9972990Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9973278Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9973281Z 2025-12-04T10:49:10.9973366Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9973442Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9973500Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9973772Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9973845Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9973882Z graph_break [] 2025-12-04T10:49:10.9973955Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9974297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9974341Z if out == self.unknown_value: 2025-12-04T10:49:10.9974490Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9974537Z Traceback (most recent call last): 2025-12-04T10:49:10.9974689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9974731Z method(*args, **kwargs) 2025-12-04T10:49:10.9974912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9974952Z method(*args, **kwargs) 2025-12-04T10:49:10.9975102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9975140Z with policy(): 2025-12-04T10:49:10.9975290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9975332Z raise RuntimeError(msg) 2025-12-04T10:49:10.9975729Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:10.9975733Z 2025-12-04T10:49:10.9975805Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9976092Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9976095Z 2025-12-04T10:49:10.9976180Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9976252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9976307Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9976598Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9976669Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9976707Z graph_break [] 2025-12-04T10:49:10.9976780Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9977120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9977164Z if out == self.unknown_value: 2025-12-04T10:49:10.9977235Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9977290Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9977363Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9977635Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9977671Z graph_break [] 2025-12-04T10:49:10.9977725Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9977875Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:10.9977921Z Traceback (most recent call last): 2025-12-04T10:49:10.9978074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9978114Z method(*args, **kwargs) 2025-12-04T10:49:10.9978264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9978304Z method(*args, **kwargs) 2025-12-04T10:49:10.9978453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9978489Z with policy(): 2025-12-04T10:49:10.9978661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9978701Z raise RuntimeError(msg) 2025-12-04T10:49:10.9979099Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9979101Z 2025-12-04T10:49:10.9979172Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9979463Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9979465Z 2025-12-04T10:49:10.9979552Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9979628Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9979683Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9979954Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9980025Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9980089Z graph_break [] 2025-12-04T10:49:10.9980161Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:10.9980503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:10.9980552Z if out == self.unknown_value: 2025-12-04T10:49:10.9980623Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9980681Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9980752Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9981021Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9981059Z graph_break [] 2025-12-04T10:49:10.9981132Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9981186Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9981259Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9981528Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9981567Z graph_break [] 2025-12-04T10:49:10.9981812Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d8b0179133188cec.xml - 2025-12-04T10:49:10.9981915Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9982580Z FAILED [0.5993s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:10.9982583Z 2025-12-04T10:49:10.9982655Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9982943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9982946Z 2025-12-04T10:49:10.9983031Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9983096Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:10.9983162Z ================== 1 failed, 57 deselected, 2 rerun in 11.40s ================== 2025-12-04T10:49:10.9983201Z Got exit code 1 2025-12-04T10:49:10.9983438Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:10.9983569Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:10.9983769Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1429ab1eba6cb4a9.xml 2025-12-04T10:49:10.9983828Z ============================= test session starts ============================== 2025-12-04T10:49:10.9983943Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9984019Z cachedir: .pytest_cache 2025-12-04T10:49:10.9984181Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9984229Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:10.9984272Z configfile: pytest.ini 2025-12-04T10:49:10.9984435Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9984511Z collecting ... collected 58 items / 9 deselected / 49 selected 2025-12-04T10:49:10.9984563Z stepcurrent: skipping 9 already run items. 2025-12-04T10:49:10.9984612Z Running 49 items in this shard 2025-12-04T10:49:10.9984614Z 2025-12-04T10:49:10.9984860Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0147s] [ 2%] 2025-12-04T10:49:10.9985104Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5682s] [ 2%] 2025-12-04T10:49:10.9985324Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.5660s] [ 2%] 2025-12-04T10:49:10.9985329Z 2025-12-04T10:49:10.9985382Z ==================================== RERUNS ==================================== 2025-12-04T10:49:10.9985532Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:10.9985577Z Traceback (most recent call last): 2025-12-04T10:49:10.9985735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9985776Z method(*args, **kwargs) 2025-12-04T10:49:10.9985931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9985972Z method(*args, **kwargs) 2025-12-04T10:49:10.9986126Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9986163Z with policy(): 2025-12-04T10:49:10.9986345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9986386Z raise RuntimeError(msg) 2025-12-04T10:49:10.9986782Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:10.9986784Z 2025-12-04T10:49:10.9986857Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9987145Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9987148Z 2025-12-04T10:49:10.9987236Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9987310Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9987368Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9987640Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9987715Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9987773Z graph_break [] 2025-12-04T10:49:10.9987924Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 
_ 2025-12-04T10:49:10.9987970Z Traceback (most recent call last): 2025-12-04T10:49:10.9988125Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9988164Z method(*args, **kwargs) 2025-12-04T10:49:10.9988318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9988357Z method(*args, **kwargs) 2025-12-04T10:49:10.9988510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9988545Z with policy(): 2025-12-04T10:49:10.9988700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9988743Z raise RuntimeError(msg) 2025-12-04T10:49:10.9989139Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:10.9989142Z 2025-12-04T10:49:10.9989218Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9989505Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9989507Z 2025-12-04T10:49:10.9989596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9989667Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9989725Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9989995Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9990092Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9990129Z graph_break [] 2025-12-04T10:49:10.9990204Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9990259Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9990331Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9990602Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9990640Z graph_break [] 2025-12-04T10:49:10.9990692Z =================================== FAILURES =================================== 2025-12-04T10:49:10.9990841Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:10.9990889Z Traceback (most recent call last): 2025-12-04T10:49:10.9991043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9991086Z method(*args, **kwargs) 2025-12-04T10:49:10.9991237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:10.9991278Z method(*args, **kwargs) 2025-12-04T10:49:10.9991427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:10.9991490Z with policy(): 2025-12-04T10:49:10.9991641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:10.9991684Z raise RuntimeError(msg) 2025-12-04T10:49:10.9992121Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:10.9992123Z 2025-12-04T10:49:10.9992199Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9992486Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9992490Z 2025-12-04T10:49:10.9992575Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9992651Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9992705Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9992981Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9993050Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9993089Z graph_break [] 2025-12-04T10:49:10.9993161Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9993220Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9993291Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9993564Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9993600Z graph_break [] 2025-12-04T10:49:10.9993673Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:10.9993763Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:10.9993839Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:10.9994110Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:10.9994146Z graph_break [] 2025-12-04T10:49:10.9994396Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1429ab1eba6cb4a9.xml - 2025-12-04T10:49:10.9994457Z =========================== short test summary info ============================ 2025-12-04T10:49:10.9995085Z FAILED [0.5660s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:10.9995087Z 2025-12-04T10:49:10.9995160Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:10.9995448Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9995478Z 2025-12-04T10:49:10.9995564Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:10.9995628Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:10.9995697Z =================== 1 failed, 9 deselected, 2 rerun in 4.29s =================== 2025-12-04T10:49:10.9995737Z Got exit code 1 2025-12-04T10:49:10.9995779Z Retrying single test... 2025-12-04T10:49:10.9995977Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4e44474d4407c86.xml 2025-12-04T10:49:10.9996036Z ============================= test session starts ============================== 2025-12-04T10:49:10.9996148Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:10.9996191Z cachedir: .pytest_cache 2025-12-04T10:49:10.9996350Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:10.9996399Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:10.9996439Z configfile: pytest.ini 2025-12-04T10:49:10.9996606Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:10.9996683Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:10.9996967Z stepcurrent: skipping 9 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:10.9997011Z Running 1 items in this shard 2025-12-04T10:49:10.9997013Z 2025-12-04T10:49:10.9997376Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:20.903052462 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9997380Z 2025-12-04T10:49:10.9997534Z [W1204 10:10:27.430452132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9997536Z 2025-12-04T10:49:10.9997709Z [W1204 10:10:27.430630108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9997712Z 2025-12-04T10:49:10.9997864Z [W1204 10:10:27.434200435 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9997866Z 2025-12-04T10:49:10.9998013Z [W1204 10:10:27.434466459 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9998019Z 2025-12-04T10:49:10.9998166Z [W1204 10:10:27.434541588 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9998168Z 2025-12-04T10:49:10.9998316Z [W1204 10:10:27.436935088 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9998317Z 2025-12-04T10:49:10.9998467Z [W1204 10:10:27.437204023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9998469Z 2025-12-04T10:49:10.9998617Z [W1204 10:10:27.437279931 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:10.9998618Z 2025-12-04T10:49:10.9998671Z ('RERUN', {'yellow': True}) [10.4040s] [100%] 2025-12-04T10:49:10.9999029Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:28.004877348 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999066Z 2025-12-04T10:49:10.9999218Z [W1204 10:10:28.005254670 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999220Z 2025-12-04T10:49:10.9999368Z [W1204 10:10:28.005337508 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999370Z 2025-12-04T10:49:10.9999521Z [W1204 10:10:28.006699000 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999523Z 2025-12-04T10:49:10.9999670Z [W1204 10:10:28.006945445 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999672Z 2025-12-04T10:49:10.9999823Z [W1204 10:10:28.007024044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999825Z 2025-12-04T10:49:10.9999972Z [W1204 10:10:28.009170070 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:10.9999974Z 2025-12-04T10:49:11.0000122Z [W1204 10:10:28.009425504 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0000124Z 2025-12-04T10:49:11.0000272Z [W1204 10:10:28.009500723 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0000274Z 2025-12-04T10:49:11.0000322Z ('RERUN', {'yellow': True}) [0.4347s] [100%] 2025-12-04T10:49:11.0000678Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:28.439581813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0000682Z 2025-12-04T10:49:11.0000830Z [W1204 10:10:28.439921006 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0000835Z 2025-12-04T10:49:11.0001003Z [W1204 10:10:28.440036034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001005Z 2025-12-04T10:49:11.0001154Z [W1204 10:10:28.441419425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001156Z 2025-12-04T10:49:11.0001308Z [W1204 10:10:28.441676660 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001309Z 2025-12-04T10:49:11.0001456Z [W1204 10:10:28.441752378 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001460Z 2025-12-04T10:49:11.0001609Z [W1204 10:10:28.443911154 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0001610Z 2025-12-04T10:49:11.0001759Z [W1204 10:10:28.444168089 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001763Z 2025-12-04T10:49:11.0001944Z [W1204 10:10:28.444244067 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0001946Z 2025-12-04T10:49:11.0001985Z FAILED [0.4310s] [100%] 2025-12-04T10:49:11.0001987Z 2025-12-04T10:49:11.0002039Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0002189Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0002273Z Traceback (most recent call last): 2025-12-04T10:49:11.0002431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0002470Z method(*args, **kwargs) 2025-12-04T10:49:11.0002625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0002664Z method(*args, **kwargs) 2025-12-04T10:49:11.0002815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0002851Z with policy(): 2025-12-04T10:49:11.0003004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0003043Z raise RuntimeError(msg) 2025-12-04T10:49:11.0003451Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0003455Z 2025-12-04T10:49:11.0003529Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0003818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0003821Z 2025-12-04T10:49:11.0003908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0003979Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0004035Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0004307Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0004379Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0004415Z graph_break [] 2025-12-04T10:49:11.0004515Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0004857Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0004903Z if out == self.unknown_value: 2025-12-04T10:49:11.0005053Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0005100Z Traceback (most recent call last): 2025-12-04T10:49:11.0005253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0005292Z method(*args, **kwargs) 2025-12-04T10:49:11.0005442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0005481Z method(*args, **kwargs) 2025-12-04T10:49:11.0005632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0005668Z with policy(): 2025-12-04T10:49:11.0005820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0005860Z raise RuntimeError(msg) 2025-12-04T10:49:11.0006260Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0006290Z 2025-12-04T10:49:11.0006361Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0006647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0006649Z 2025-12-04T10:49:11.0006735Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0006806Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0006861Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0007131Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0007206Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0007241Z graph_break [] 2025-12-04T10:49:11.0007312Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0007656Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0007700Z if out == self.unknown_value: 2025-12-04T10:49:11.0007770Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0007826Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0007896Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0008165Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0008201Z graph_break [] 2025-12-04T10:49:11.0008277Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0008426Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0008473Z Traceback (most recent call last): 2025-12-04T10:49:11.0008628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0008667Z method(*args, **kwargs) 2025-12-04T10:49:11.0008817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0008858Z method(*args, **kwargs) 2025-12-04T10:49:11.0009008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0009043Z with policy(): 2025-12-04T10:49:11.0009198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0009238Z raise RuntimeError(msg) 2025-12-04T10:49:11.0009639Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0009641Z 2025-12-04T10:49:11.0009713Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0010023Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0010025Z 2025-12-04T10:49:11.0010110Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0010183Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0010239Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0010508Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0010579Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0010615Z graph_break [] 2025-12-04T10:49:11.0010687Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0011027Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0011071Z if out == self.unknown_value: 2025-12-04T10:49:11.0011142Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0011197Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0011268Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0011537Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0011571Z graph_break [] 2025-12-04T10:49:11.0011644Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0011697Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0011768Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0012101Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0012138Z graph_break [] 2025-12-04T10:49:11.0012382Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4e44474d4407c86.xml - 2025-12-04T10:49:11.0012440Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0013065Z FAILED [0.4310s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0013069Z 2025-12-04T10:49:11.0013142Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0013428Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0013430Z 2025-12-04T10:49:11.0013514Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0013575Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0013675Z ================== 1 failed, 57 deselected, 2 rerun in 11.41s ================== 2025-12-04T10:49:11.0013711Z Got exit code 1 2025-12-04T10:49:11.0013752Z Retrying single test... 2025-12-04T10:49:11.0013948Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a9b6a3016fdbd013.xml 2025-12-04T10:49:11.0014007Z ============================= test session starts ============================== 2025-12-04T10:49:11.0014118Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0014160Z cachedir: .pytest_cache 2025-12-04T10:49:11.0014318Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0014363Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0014402Z configfile: pytest.ini 2025-12-04T10:49:11.0014568Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0014640Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0014923Z stepcurrent: skipping 9 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0014967Z Running 1 items in this shard 2025-12-04T10:49:11.0014969Z 2025-12-04T10:49:11.0015325Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:38.969184835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0015328Z 2025-12-04T10:49:11.0015481Z [W1204 10:10:45.306951940 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0015485Z 2025-12-04T10:49:11.0015634Z [W1204 10:10:45.307106437 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0015636Z 2025-12-04T10:49:11.0015811Z [W1204 10:10:45.310495148 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0015813Z 2025-12-04T10:49:11.0015963Z [W1204 10:10:45.310785682 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0015964Z 2025-12-04T10:49:11.0016113Z [W1204 10:10:45.310862050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0016115Z 2025-12-04T10:49:11.0016263Z [W1204 10:10:45.313450337 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0016266Z 2025-12-04T10:49:11.0016412Z [W1204 10:10:45.313719432 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0016414Z 2025-12-04T10:49:11.0016562Z [W1204 10:10:45.313793770 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0016566Z 2025-12-04T10:49:11.0016615Z ('RERUN', {'yellow': True}) [10.3604s] [100%] 2025-12-04T10:49:11.0016970Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:46.955085467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0016973Z 2025-12-04T10:49:11.0017122Z [W1204 10:10:46.955476419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0017155Z 2025-12-04T10:49:11.0017302Z [W1204 10:10:46.955556177 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0017304Z 2025-12-04T10:49:11.0017453Z [W1204 10:10:46.956940569 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0017456Z 2025-12-04T10:49:11.0017603Z [W1204 10:10:46.957197474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0017605Z 2025-12-04T10:49:11.0017753Z [W1204 10:10:46.957275012 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0017755Z 2025-12-04T10:49:11.0017902Z [W1204 10:10:46.959590365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0017906Z 2025-12-04T10:49:11.0018052Z [W1204 10:10:46.959850059 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0018054Z 2025-12-04T10:49:11.0018201Z [W1204 10:10:46.959925708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0018203Z 2025-12-04T10:49:11.0018252Z ('RERUN', {'yellow': True}) [0.5058s] [100%] 2025-12-04T10:49:11.0018607Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:10:46.460240748 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0018610Z 2025-12-04T10:49:11.0018760Z [W1204 10:10:46.460637350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0018765Z 2025-12-04T10:49:11.0018912Z [W1204 10:10:46.460733768 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0018914Z 2025-12-04T10:49:11.0019062Z [W1204 10:10:46.462141389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0019064Z 2025-12-04T10:49:11.0019233Z [W1204 10:10:46.462414394 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0019236Z 2025-12-04T10:49:11.0019384Z [W1204 10:10:46.462491622 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0019386Z 2025-12-04T10:49:11.0019532Z [W1204 10:10:46.464775445 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0019536Z 2025-12-04T10:49:11.0019684Z [W1204 10:10:46.465042500 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0019686Z 2025-12-04T10:49:11.0019834Z [W1204 10:10:46.465119868 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0019836Z 2025-12-04T10:49:11.0019874Z FAILED [0.4991s] [100%] 2025-12-04T10:49:11.0019876Z 2025-12-04T10:49:11.0019931Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0020080Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0020126Z Traceback (most recent call last): 2025-12-04T10:49:11.0020281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0020343Z method(*args, **kwargs) 2025-12-04T10:49:11.0020494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0020534Z method(*args, **kwargs) 2025-12-04T10:49:11.0020684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0020722Z with policy(): 2025-12-04T10:49:11.0020876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0020916Z raise RuntimeError(msg) 2025-12-04T10:49:11.0021306Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0021311Z 2025-12-04T10:49:11.0021385Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0021672Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0021674Z 2025-12-04T10:49:11.0021761Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0021832Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0021922Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0022193Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0022264Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0022304Z graph_break [] 2025-12-04T10:49:11.0022374Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0022748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0022792Z if out == self.unknown_value: 2025-12-04T10:49:11.0022940Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0022984Z Traceback (most recent call last): 2025-12-04T10:49:11.0023137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0023177Z method(*args, **kwargs) 2025-12-04T10:49:11.0023326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0023369Z method(*args, **kwargs) 2025-12-04T10:49:11.0023518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0023556Z with policy(): 2025-12-04T10:49:11.0023709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0023749Z raise RuntimeError(msg) 2025-12-04T10:49:11.0024145Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0024147Z 2025-12-04T10:49:11.0024221Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0024532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0024536Z 2025-12-04T10:49:11.0024620Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0024692Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0024746Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0025015Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0025086Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0025127Z graph_break [] 2025-12-04T10:49:11.0025198Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0025540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0025584Z if out == self.unknown_value: 2025-12-04T10:49:11.0025655Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0025708Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0025779Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0026047Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0026086Z graph_break [] 2025-12-04T10:49:11.0026136Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0026286Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0026331Z Traceback (most recent call last): 2025-12-04T10:49:11.0026510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0026551Z method(*args, **kwargs) 2025-12-04T10:49:11.0026700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0026739Z method(*args, **kwargs) 2025-12-04T10:49:11.0026888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0026925Z with policy(): 2025-12-04T10:49:11.0027077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0027118Z raise RuntimeError(msg) 2025-12-04T10:49:11.0027519Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0027522Z 2025-12-04T10:49:11.0027594Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0027877Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0027881Z 2025-12-04T10:49:11.0027965Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0028063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0028117Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0028388Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0028458Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0028497Z graph_break [] 2025-12-04T10:49:11.0028567Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0028906Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0028950Z if out == self.unknown_value: 2025-12-04T10:49:11.0029022Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0029075Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0029146Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0029414Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0029450Z graph_break [] 2025-12-04T10:49:11.0029520Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0029574Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0029643Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0029911Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0029948Z graph_break [] 2025-12-04T10:49:11.0030212Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a9b6a3016fdbd013.xml - 2025-12-04T10:49:11.0030273Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0030897Z FAILED [0.4991s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0030901Z 2025-12-04T10:49:11.0030972Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0031258Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0031260Z 2025-12-04T10:49:11.0031343Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0031405Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0031470Z ================== 1 failed, 57 deselected, 2 rerun in 11.51s ================== 2025-12-04T10:49:11.0031507Z Got exit code 1 2025-12-04T10:49:11.0031743Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0031930Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0032124Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1df0f8658b584a76.xml 2025-12-04T10:49:11.0032183Z ============================= test session starts ============================== 2025-12-04T10:49:11.0032293Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0032335Z cachedir: .pytest_cache 2025-12-04T10:49:11.0032492Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0032538Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0032577Z configfile: pytest.ini 2025-12-04T10:49:11.0032739Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0032813Z collecting ... collected 58 items / 10 deselected / 48 selected 2025-12-04T10:49:11.0032865Z stepcurrent: skipping 10 already run items. 2025-12-04T10:49:11.0032908Z Running 48 items in this shard 2025-12-04T10:49:11.0032912Z 2025-12-04T10:49:11.0033162Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7511s] [ 2%] 2025-12-04T10:49:11.0033408Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6657s] [ 2%] 2025-12-04T10:49:11.0033629Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.6577s] [ 2%] 2025-12-04T10:49:11.0033633Z 2025-12-04T10:49:11.0033685Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0033836Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0033881Z Traceback (most recent call last): 2025-12-04T10:49:11.0034069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0034111Z method(*args, **kwargs) 2025-12-04T10:49:11.0034262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0034302Z method(*args, **kwargs) 2025-12-04T10:49:11.0034451Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0034489Z with policy(): 2025-12-04T10:49:11.0034641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0034683Z raise RuntimeError(msg) 2025-12-04T10:49:11.0035083Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0035085Z 2025-12-04T10:49:11.0035156Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0035446Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0035449Z 2025-12-04T10:49:11.0035559Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0035634Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0035688Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0035962Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0036034Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0036070Z graph_break [] 2025-12-04T10:49:11.0036220Z _ 
TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0036264Z Traceback (most recent call last): 2025-12-04T10:49:11.0036417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0036457Z method(*args, **kwargs) 2025-12-04T10:49:11.0036607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0036645Z method(*args, **kwargs) 2025-12-04T10:49:11.0036795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0036832Z with policy(): 2025-12-04T10:49:11.0036984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0037024Z raise RuntimeError(msg) 2025-12-04T10:49:11.0037429Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0037433Z 2025-12-04T10:49:11.0037504Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0037792Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0037817Z 2025-12-04T10:49:11.0037903Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0037973Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0038028Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0038297Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0038370Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0038405Z graph_break [] 2025-12-04T10:49:11.0038477Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0038531Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0038601Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0038867Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0038906Z graph_break [] 2025-12-04T10:49:11.0038957Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0039108Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 
2025-12-04T10:49:11.0039177Z Traceback (most recent call last): 2025-12-04T10:49:11.0039330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0039369Z method(*args, **kwargs) 2025-12-04T10:49:11.0039520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0039561Z method(*args, **kwargs) 2025-12-04T10:49:11.0039711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0039748Z with policy(): 2025-12-04T10:49:11.0039898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0039939Z raise RuntimeError(msg) 2025-12-04T10:49:11.0040342Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0040346Z 2025-12-04T10:49:11.0040417Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0040709Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0040712Z 2025-12-04T10:49:11.0040797Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0040867Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0040922Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0041190Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0041264Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0041300Z graph_break [] 2025-12-04T10:49:11.0041398Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0041453Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0041521Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0041788Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0041823Z graph_break [] 2025-12-04T10:49:11.0041925Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0041979Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0042049Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0042315Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0042351Z graph_break [] 2025-12-04T10:49:11.0042592Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1df0f8658b584a76.xml - 2025-12-04T10:49:11.0042651Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0043288Z FAILED [0.6577s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0043321Z 2025-12-04T10:49:11.0043393Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0043679Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0043681Z 2025-12-04T10:49:11.0043764Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0043825Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0043892Z ================== 1 failed, 10 deselected, 2 rerun in 4.24s =================== 2025-12-04T10:49:11.0043930Z Got exit code 1 2025-12-04T10:49:11.0043970Z Retrying single test... 2025-12-04T10:49:11.0044167Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b0b76e39aca554e1.xml 2025-12-04T10:49:11.0044224Z ============================= test session starts ============================== 2025-12-04T10:49:11.0044336Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0044375Z cachedir: .pytest_cache 2025-12-04T10:49:11.0044533Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0044579Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0044618Z configfile: pytest.ini 2025-12-04T10:49:11.0044780Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0044855Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0045141Z stepcurrent: skipping 10 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0045210Z Running 1 items in this shard 2025-12-04T10:49:11.0045213Z 2025-12-04T10:49:11.0045574Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:07.861210035 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0045576Z 2025-12-04T10:49:11.0045728Z [W1204 10:11:14.459575164 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0045732Z 2025-12-04T10:49:11.0045883Z [W1204 10:11:14.459737191 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0045885Z 2025-12-04T10:49:11.0046033Z [W1204 10:11:14.463714890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0046035Z 2025-12-04T10:49:11.0046182Z [W1204 10:11:14.464026974 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0046184Z 2025-12-04T10:49:11.0046331Z [W1204 10:11:14.464105392 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0046333Z 2025-12-04T10:49:11.0046479Z [W1204 10:11:14.466606271 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0046506Z 2025-12-04T10:49:11.0046654Z [W1204 10:11:14.466882086 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0046656Z 2025-12-04T10:49:11.0046804Z [W1204 10:11:14.466958124 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0046805Z 2025-12-04T10:49:11.0046858Z ('RERUN', {'yellow': True}) [10.3891s] [100%] 2025-12-04T10:49:11.0047220Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:15.297052298 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047222Z 2025-12-04T10:49:11.0047370Z [W1204 10:11:15.297427560 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047374Z 2025-12-04T10:49:11.0047523Z [W1204 10:11:15.297506929 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047525Z 2025-12-04T10:49:11.0047674Z [W1204 10:11:15.298885261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047675Z 2025-12-04T10:49:11.0047823Z [W1204 10:11:15.299136665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047825Z 2025-12-04T10:49:11.0047974Z [W1204 10:11:15.299213594 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0047975Z 2025-12-04T10:49:11.0048122Z [W1204 10:11:15.301476668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0048125Z 2025-12-04T10:49:11.0048274Z [W1204 10:11:15.301727223 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0048276Z 2025-12-04T10:49:11.0048423Z [W1204 10:11:15.301801371 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0048426Z 2025-12-04T10:49:11.0048494Z ('RERUN', {'yellow': True}) [0.5646s] [100%] 2025-12-04T10:49:11.0048850Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:16.776078826 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0048852Z 2025-12-04T10:49:11.0048999Z [W1204 10:11:16.776519047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049002Z 2025-12-04T10:49:11.0049150Z [W1204 10:11:16.776612735 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049152Z 2025-12-04T10:49:11.0049300Z [W1204 10:11:16.778017026 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049302Z 2025-12-04T10:49:11.0049451Z [W1204 10:11:16.778279941 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049452Z 2025-12-04T10:49:11.0049600Z [W1204 10:11:16.778355840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049601Z 2025-12-04T10:49:11.0049746Z [W1204 10:11:16.780633093 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0049767Z 2025-12-04T10:49:11.0049916Z [W1204 10:11:16.780888198 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0049917Z 2025-12-04T10:49:11.0050064Z [W1204 10:11:16.780963027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0050066Z 2025-12-04T10:49:11.0050105Z FAILED [0.4825s] [100%] 2025-12-04T10:49:11.0050109Z 2025-12-04T10:49:11.0050160Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0050311Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0050357Z Traceback (most recent call last): 2025-12-04T10:49:11.0050514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0050554Z method(*args, **kwargs) 2025-12-04T10:49:11.0050707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0050747Z method(*args, **kwargs) 2025-12-04T10:49:11.0050896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0050934Z with policy(): 2025-12-04T10:49:11.0051086Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0051127Z raise RuntimeError(msg) 2025-12-04T10:49:11.0051526Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0051529Z 2025-12-04T10:49:11.0051605Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0051922Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0051926Z 2025-12-04T10:49:11.0052040Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0052114Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0052168Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0052439Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0052511Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0052549Z graph_break [] 2025-12-04T10:49:11.0052620Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0052964Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0053009Z if out == self.unknown_value: 2025-12-04T10:49:11.0053159Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0053204Z Traceback (most recent call last): 2025-12-04T10:49:11.0053357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0053395Z method(*args, **kwargs) 2025-12-04T10:49:11.0053545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0053620Z method(*args, **kwargs) 2025-12-04T10:49:11.0053772Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0053807Z with policy(): 2025-12-04T10:49:11.0053960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0054001Z raise RuntimeError(msg) 2025-12-04T10:49:11.0054407Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0054409Z 2025-12-04T10:49:11.0054481Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0054769Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0054771Z 2025-12-04T10:49:11.0054856Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0054928Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0054983Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0055249Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0055321Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0055357Z graph_break [] 2025-12-04T10:49:11.0055429Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0055771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0055813Z if out == self.unknown_value: 2025-12-04T10:49:11.0055906Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0055960Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0056032Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0056298Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0056336Z graph_break [] 2025-12-04T10:49:11.0056386Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0056537Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0056580Z Traceback (most recent call last): 2025-12-04T10:49:11.0056735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0056773Z method(*args, **kwargs) 2025-12-04T10:49:11.0056924Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0056962Z method(*args, **kwargs) 2025-12-04T10:49:11.0057112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0057147Z with policy(): 2025-12-04T10:49:11.0057328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0057369Z raise RuntimeError(msg) 2025-12-04T10:49:11.0057777Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0057780Z 2025-12-04T10:49:11.0057851Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0058139Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0058141Z 2025-12-04T10:49:11.0058227Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0058298Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0058353Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0058623Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0058695Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0058731Z graph_break [] 2025-12-04T10:49:11.0058802Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0059142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0059186Z if out == self.unknown_value: 2025-12-04T10:49:11.0059257Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0059310Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0059380Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0059669Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0059706Z graph_break [] 2025-12-04T10:49:11.0059776Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0059829Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0059899Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0060170Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0060205Z graph_break [] 2025-12-04T10:49:11.0060454Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b0b76e39aca554e1.xml - 2025-12-04T10:49:11.0060512Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0061150Z FAILED [0.4825s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0061173Z 2025-12-04T10:49:11.0061245Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0061532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0061534Z 2025-12-04T10:49:11.0061619Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0061678Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0061745Z ================== 1 failed, 57 deselected, 2 rerun in 11.60s ================== 2025-12-04T10:49:11.0061780Z Got exit code 1 2025-12-04T10:49:11.0061820Z Retrying single test... 2025-12-04T10:49:11.0062051Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1149fd30a5548eb9.xml 2025-12-04T10:49:11.0062108Z ============================= test session starts ============================== 2025-12-04T10:49:11.0062220Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0062260Z cachedir: .pytest_cache 2025-12-04T10:49:11.0062421Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0062466Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0062506Z configfile: pytest.ini 2025-12-04T10:49:11.0062666Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0062739Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0063020Z stepcurrent: skipping 10 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0063065Z Running 1 items in this shard 2025-12-04T10:49:11.0063067Z 2025-12-04T10:49:11.0063458Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:25.058850249 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0063461Z 2025-12-04T10:49:11.0063612Z [W1204 10:11:33.703885665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0063614Z 2025-12-04T10:49:11.0063764Z [W1204 10:11:33.704066702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0063768Z 2025-12-04T10:49:11.0063915Z [W1204 10:11:33.708054641 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0063917Z 2025-12-04T10:49:11.0064065Z [W1204 10:11:33.708356275 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0064067Z 2025-12-04T10:49:11.0064214Z [W1204 10:11:33.708432573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0064216Z 2025-12-04T10:49:11.0064364Z [W1204 10:11:33.711068890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0064366Z 2025-12-04T10:49:11.0064514Z [W1204 10:11:33.711340424 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0064516Z 2025-12-04T10:49:11.0064689Z [W1204 10:11:33.711416343 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0064691Z 2025-12-04T10:49:11.0064741Z ('RERUN', {'yellow': True}) [10.4308s] [100%] 2025-12-04T10:49:11.0065098Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:34.528211239 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065100Z 2025-12-04T10:49:11.0065248Z [W1204 10:11:34.528621781 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065250Z 2025-12-04T10:49:11.0065396Z [W1204 10:11:34.528710399 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065398Z 2025-12-04T10:49:11.0065545Z [W1204 10:11:34.530115311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065547Z 2025-12-04T10:49:11.0065694Z [W1204 10:11:34.530372316 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065696Z 2025-12-04T10:49:11.0065843Z [W1204 10:11:34.530451054 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0065844Z 2025-12-04T10:49:11.0065991Z [W1204 10:11:34.532717268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0065993Z 2025-12-04T10:49:11.0066138Z [W1204 10:11:34.532977483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0066141Z 2025-12-04T10:49:11.0066288Z [W1204 10:11:34.533058361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0066292Z 2025-12-04T10:49:11.0066342Z ('RERUN', {'yellow': True}) [0.7148s] [100%] 2025-12-04T10:49:11.0066719Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:11:34.239777469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0066721Z 2025-12-04T10:49:11.0066872Z [W1204 10:11:34.240237129 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0066874Z 2025-12-04T10:49:11.0067020Z [W1204 10:11:34.240333887 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067022Z 2025-12-04T10:49:11.0067173Z [W1204 10:11:34.241755079 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067177Z 2025-12-04T10:49:11.0067327Z [W1204 10:11:34.242025823 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0067328Z 2025-12-04T10:49:11.0067475Z [W1204 10:11:34.242106792 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067477Z 2025-12-04T10:49:11.0067625Z [W1204 10:11:34.244430015 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067627Z 2025-12-04T10:49:11.0067774Z [W1204 10:11:34.244707959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067776Z 2025-12-04T10:49:11.0067927Z [W1204 10:11:34.244784077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0067957Z 2025-12-04T10:49:11.0067998Z FAILED [0.6708s] [100%] 2025-12-04T10:49:11.0068000Z 2025-12-04T10:49:11.0068052Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0068206Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0068253Z Traceback (most recent call last): 2025-12-04T10:49:11.0068411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0068451Z method(*args, **kwargs) 2025-12-04T10:49:11.0068605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0068644Z method(*args, **kwargs) 2025-12-04T10:49:11.0068796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0068835Z with policy(): 2025-12-04T10:49:11.0068989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0069030Z raise RuntimeError(msg) 2025-12-04T10:49:11.0069432Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0069435Z 2025-12-04T10:49:11.0069507Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0069799Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0069803Z 2025-12-04T10:49:11.0069892Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0069963Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0070020Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0070314Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0070388Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0070425Z graph_break [] 2025-12-04T10:49:11.0070498Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0070840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0070886Z if out == self.unknown_value: 2025-12-04T10:49:11.0071036Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0071085Z Traceback (most recent call last): 2025-12-04T10:49:11.0071241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0071283Z method(*args, **kwargs) 2025-12-04T10:49:11.0071434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0071475Z method(*args, **kwargs) 2025-12-04T10:49:11.0071628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0071687Z with policy(): 2025-12-04T10:49:11.0071840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0071926Z raise RuntimeError(msg) 2025-12-04T10:49:11.0072333Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0072336Z 2025-12-04T10:49:11.0072408Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0072696Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0072699Z 2025-12-04T10:49:11.0072785Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0072858Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0072913Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0073185Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0073256Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0073294Z graph_break [] 2025-12-04T10:49:11.0073365Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0073707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0073753Z if out == self.unknown_value: 2025-12-04T10:49:11.0073823Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0073880Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0073978Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0074245Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0074281Z graph_break [] 2025-12-04T10:49:11.0074336Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0074486Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0074535Z Traceback (most recent call last): 2025-12-04T10:49:11.0074689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0074731Z method(*args, **kwargs) 2025-12-04T10:49:11.0074880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0074922Z method(*args, **kwargs) 2025-12-04T10:49:11.0075073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0075111Z with policy(): 2025-12-04T10:49:11.0075263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0075302Z raise RuntimeError(msg) 2025-12-04T10:49:11.0075711Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0075742Z 2025-12-04T10:49:11.0075814Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0076106Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0076108Z 2025-12-04T10:49:11.0076193Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0076265Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0076319Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0076591Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0076661Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0076697Z graph_break [] 2025-12-04T10:49:11.0076771Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0077109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0077155Z if out == self.unknown_value: 2025-12-04T10:49:11.0077225Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0077280Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0077352Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0077620Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0077654Z graph_break [] 2025-12-04T10:49:11.0077747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0077801Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0077871Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0078137Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0078175Z graph_break [] 2025-12-04T10:49:11.0078418Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1149fd30a5548eb9.xml - 2025-12-04T10:49:11.0078479Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0079117Z FAILED [0.6708s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680.
2025-12-04T10:49:11.0079119Z
2025-12-04T10:49:11.0079190Z To execute this test, run the following from the base repo dir:
2025-12-04T10:49:11.0079502Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16
2025-12-04T10:49:11.0079504Z
2025-12-04T10:49:11.0079587Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T10:49:11.0079651Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T10:49:11.0079720Z ================== 1 failed, 57 deselected, 2 rerun in 11.97s ==================
2025-12-04T10:49:11.0079756Z Got exit code 1
2025-12-04T10:49:11.0079997Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16
2025-12-04T10:49:11.0080124Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T10:49:11.0080324Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-38bed385a2ee3708.xml
2025-12-04T10:49:11.0080381Z ============================= test session starts ==============================
2025-12-04T10:49:11.0080494Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T10:49:11.0080535Z cachedir: .pytest_cache
2025-12-04T10:49:11.0080697Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T10:49:11.0080743Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T10:49:11.0080784Z configfile: pytest.ini
2025-12-04T10:49:11.0080946Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T10:49:11.0081020Z collecting ... collected 58 items / 11 deselected / 47 selected
2025-12-04T10:49:11.0081073Z stepcurrent: skipping 11 already run items.
2025-12-04T10:49:11.0081120Z Running 47 items in this shard
2025-12-04T10:49:11.0081122Z
2025-12-04T10:49:11.0081366Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7300s] [ 2%]
2025-12-04T10:49:11.0081636Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5979s] [ 2%]
2025-12-04T10:49:11.0081894Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.5824s] [ 2%]
2025-12-04T10:49:11.0081896Z
2025-12-04T10:49:11.0081947Z ==================================== RERUNS ====================================
2025-12-04T10:49:11.0082099Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _
2025-12-04T10:49:11.0082144Z Traceback (most recent call last):
2025-12-04T10:49:11.0082305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T10:49:11.0082345Z method(*args, **kwargs)
2025-12-04T10:49:11.0082500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T10:49:11.0082540Z method(*args, **kwargs)
2025-12-04T10:49:11.0082693Z File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0082728Z with policy(): 2025-12-04T10:49:11.0082881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0082921Z raise RuntimeError(msg) 2025-12-04T10:49:11.0083314Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0083345Z 2025-12-04T10:49:11.0083420Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0083707Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0083709Z 2025-12-04T10:49:11.0083796Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0083868Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0083926Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0084200Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0084275Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0084311Z graph_break [] 2025-12-04T10:49:11.0084462Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 
_ 2025-12-04T10:49:11.0084508Z Traceback (most recent call last): 2025-12-04T10:49:11.0084663Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0084702Z method(*args, **kwargs) 2025-12-04T10:49:11.0084854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0084893Z method(*args, **kwargs) 2025-12-04T10:49:11.0085044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0085081Z with policy(): 2025-12-04T10:49:11.0085231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0085273Z raise RuntimeError(msg) 2025-12-04T10:49:11.0085692Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0085695Z 2025-12-04T10:49:11.0085768Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0086053Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0086056Z 2025-12-04T10:49:11.0086142Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0086214Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0086270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0086539Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0086611Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0086647Z graph_break [] 2025-12-04T10:49:11.0086716Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0086794Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0086862Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0087130Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0087166Z graph_break [] 2025-12-04T10:49:11.0087218Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0087367Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 
2025-12-04T10:49:11.0087415Z Traceback (most recent call last): 2025-12-04T10:49:11.0087568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0087607Z method(*args, **kwargs) 2025-12-04T10:49:11.0087759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0087799Z method(*args, **kwargs) 2025-12-04T10:49:11.0087948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0087987Z with policy(): 2025-12-04T10:49:11.0088139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0088182Z raise RuntimeError(msg) 2025-12-04T10:49:11.0088580Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0088587Z 2025-12-04T10:49:11.0088660Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0088951Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0088953Z 2025-12-04T10:49:11.0089096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0089169Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0089223Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0089492Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0089562Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0089601Z graph_break [] 2025-12-04T10:49:11.0089671Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0089726Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0089794Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0090066Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0090103Z graph_break [] 2025-12-04T10:49:11.0090173Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0090227Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0090297Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0090563Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0090623Z graph_break [] 2025-12-04T10:49:11.0090866Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-38bed385a2ee3708.xml - 2025-12-04T10:49:11.0090926Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0091548Z FAILED [0.5824s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0091552Z 2025-12-04T10:49:11.0091625Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0091957Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0091959Z 2025-12-04T10:49:11.0092043Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0092104Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0092170Z ================== 1 failed, 11 deselected, 2 rerun in 4.06s ===================
2025-12-04T10:49:11.0092206Z Got exit code 1
2025-12-04T10:49:11.0092247Z Retrying single test...
2025-12-04T10:49:11.0092443Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1a18d82c8040b752.xml
2025-12-04T10:49:11.0092501Z ============================= test session starts ==============================
2025-12-04T10:49:11.0092611Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T10:49:11.0092651Z cachedir: .pytest_cache
2025-12-04T10:49:11.0092835Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T10:49:11.0092884Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T10:49:11.0092924Z configfile: pytest.ini
2025-12-04T10:49:11.0093086Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T10:49:11.0093158Z collecting ... collected 58 items / 57 deselected / 1 selected
2025-12-04T10:49:11.0093442Z stepcurrent: skipping 11 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16
2025-12-04T10:49:11.0093488Z Running 1 items in this shard
2025-12-04T10:49:11.0093490Z
2025-12-04T10:49:11.0093848Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:11:55.934363641 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0093850Z 2025-12-04T10:49:11.0094003Z [W1204 10:12:03.631230566 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094005Z 2025-12-04T10:49:11.0094153Z [W1204 10:12:03.631408242 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094155Z 2025-12-04T10:49:11.0094302Z [W1204 10:12:03.635191976 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094330Z 2025-12-04T10:49:11.0094479Z [W1204 10:12:03.635496410 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094481Z 2025-12-04T10:49:11.0094628Z [W1204 10:12:03.635572119 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094630Z 2025-12-04T10:49:11.0094778Z [W1204 10:12:03.638049209 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094780Z 2025-12-04T10:49:11.0094925Z [W1204 10:12:03.638324453 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0094927Z 2025-12-04T10:49:11.0095076Z [W1204 10:12:03.638399412 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0095080Z 2025-12-04T10:49:11.0095129Z ('RERUN', {'yellow': True}) [10.3810s] [100%] 2025-12-04T10:49:11.0095487Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:12:03.371221107 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0095489Z 2025-12-04T10:49:11.0095639Z [W1204 10:12:03.371575180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0095641Z 2025-12-04T10:49:11.0095786Z [W1204 10:12:03.371655418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0095788Z 2025-12-04T10:49:11.0095935Z [W1204 10:12:03.373034700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0095939Z 2025-12-04T10:49:11.0096085Z [W1204 10:12:03.373286155 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0096089Z 2025-12-04T10:49:11.0096262Z [W1204 10:12:03.373360764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0096264Z 2025-12-04T10:49:11.0096413Z [W1204 10:12:03.375571359 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0096415Z 2025-12-04T10:49:11.0096560Z [W1204 10:12:03.375829624 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0096561Z 2025-12-04T10:49:11.0096711Z [W1204 10:12:03.375902983 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0096715Z 2025-12-04T10:49:11.0096764Z ('RERUN', {'yellow': True}) [0.5962s] [100%] 2025-12-04T10:49:11.0097120Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:12:04.935362778 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097123Z 2025-12-04T10:49:11.0097271Z [W1204 10:12:04.935723210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097273Z 2025-12-04T10:49:11.0097419Z [W1204 10:12:04.935812979 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097421Z 2025-12-04T10:49:11.0097570Z [W1204 10:12:04.937202401 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097595Z 2025-12-04T10:49:11.0097742Z [W1204 10:12:04.937467355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097745Z 2025-12-04T10:49:11.0097895Z [W1204 10:12:04.937542134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0097897Z 2025-12-04T10:49:11.0098044Z [W1204 10:12:04.939744470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0098046Z 2025-12-04T10:49:11.0098192Z [W1204 10:12:04.940011724 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0098194Z 2025-12-04T10:49:11.0098342Z [W1204 10:12:04.940089973 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0098345Z 2025-12-04T10:49:11.0098382Z FAILED [0.6024s] [100%] 2025-12-04T10:49:11.0098384Z 2025-12-04T10:49:11.0098437Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0098585Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0098634Z Traceback (most recent call last): 2025-12-04T10:49:11.0098789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0098830Z method(*args, **kwargs) 2025-12-04T10:49:11.0098982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0099022Z method(*args, **kwargs) 2025-12-04T10:49:11.0099172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0099211Z with policy(): 2025-12-04T10:49:11.0099365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0099404Z raise RuntimeError(msg) 2025-12-04T10:49:11.0099822Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0099824Z 2025-12-04T10:49:11.0099897Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0100185Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0100189Z 2025-12-04T10:49:11.0100274Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0100346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0100401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0100673Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0100746Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0100781Z graph_break [] 2025-12-04T10:49:11.0100853Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0101194Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0101260Z if out == self.unknown_value: 2025-12-04T10:49:11.0101407Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0101454Z Traceback (most recent call last): 2025-12-04T10:49:11.0101607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0101648Z method(*args, **kwargs) 2025-12-04T10:49:11.0101798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0101838Z method(*args, **kwargs) 2025-12-04T10:49:11.0102048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0102088Z with policy(): 2025-12-04T10:49:11.0102239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0102281Z raise RuntimeError(msg) 2025-12-04T10:49:11.0102682Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0102686Z 2025-12-04T10:49:11.0102757Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0103043Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0103046Z 2025-12-04T10:49:11.0103133Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0103205Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0103260Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0103566Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0103640Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0103677Z graph_break [] 2025-12-04T10:49:11.0103747Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0104088Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0104132Z if out == self.unknown_value: 2025-12-04T10:49:11.0104204Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0104258Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0104330Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0104601Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0104637Z graph_break [] 2025-12-04T10:49:11.0104689Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0104837Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0104911Z Traceback (most recent call last): 2025-12-04T10:49:11.0105063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0105103Z method(*args, **kwargs) 2025-12-04T10:49:11.0105252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0105291Z method(*args, **kwargs) 2025-12-04T10:49:11.0105442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0105479Z with policy(): 2025-12-04T10:49:11.0105630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0105671Z raise RuntimeError(msg) 2025-12-04T10:49:11.0106069Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0106076Z 2025-12-04T10:49:11.0106147Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0106436Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0106438Z 2025-12-04T10:49:11.0106522Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0106593Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0106647Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0106916Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0106988Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0107024Z graph_break [] 2025-12-04T10:49:11.0107094Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0107458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0107501Z if out == self.unknown_value: 2025-12-04T10:49:11.0107573Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0107626Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0107698Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0107968Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0108003Z graph_break [] 2025-12-04T10:49:11.0108074Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0108128Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0108199Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0108464Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0108500Z graph_break [] 2025-12-04T10:49:11.0108740Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1a18d82c8040b752.xml - 2025-12-04T10:49:11.0108832Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0109456Z FAILED [0.6024s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0109460Z 2025-12-04T10:49:11.0109531Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0109818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0109821Z 2025-12-04T10:49:11.0109904Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0109967Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0110034Z ================== 1 failed, 57 deselected, 2 rerun in 11.73s ================== 2025-12-04T10:49:11.0110072Z Got exit code 1 2025-12-04T10:49:11.0110112Z Retrying single test... 2025-12-04T10:49:11.0110310Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a164a19da10499d2.xml 2025-12-04T10:49:11.0110366Z ============================= test session starts ============================== 2025-12-04T10:49:11.0110478Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0110519Z cachedir: .pytest_cache 2025-12-04T10:49:11.0110679Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0110724Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0110765Z configfile: pytest.ini 2025-12-04T10:49:11.0110951Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0111024Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0111309Z stepcurrent: skipping 11 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0111352Z Running 1 items in this shard 2025-12-04T10:49:11.0111354Z 2025-12-04T10:49:11.0111712Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:12:14.646212587 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0111717Z 2025-12-04T10:49:11.0111929Z [W1204 10:12:20.243481702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0111933Z 2025-12-04T10:49:11.0112085Z [W1204 10:12:20.243631179 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0112087Z 2025-12-04T10:49:11.0112234Z [W1204 10:12:20.247551040 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0112238Z 2025-12-04T10:49:11.0112384Z [W1204 10:12:20.247850504 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0112430Z 2025-12-04T10:49:11.0112578Z [W1204 10:12:20.247925063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0112580Z 2025-12-04T10:49:11.0112727Z [W1204 10:12:20.250379013 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0112729Z 2025-12-04T10:49:11.0112878Z [W1204 10:12:20.250646378 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0112880Z 2025-12-04T10:49:11.0113026Z [W1204 10:12:20.250720857 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0113030Z 2025-12-04T10:49:11.0113079Z ('RERUN', {'yellow': True}) [9.3216s] [100%] 2025-12-04T10:49:11.0113434Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:12:21.972173220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0113438Z 2025-12-04T10:49:11.0113586Z [W1204 10:12:21.972529682 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0113588Z 2025-12-04T10:49:11.0113736Z [W1204 10:12:21.972612991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0113738Z 2025-12-04T10:49:11.0113884Z [W1204 10:12:21.973976054 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0113886Z 2025-12-04T10:49:11.0114033Z [W1204 10:12:21.974229258 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0114036Z 2025-12-04T10:49:11.0114185Z [W1204 10:12:21.974307037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0114186Z 2025-12-04T10:49:11.0114333Z [W1204 10:12:21.976467474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0114335Z 2025-12-04T10:49:11.0114516Z [W1204 10:12:21.976723078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0114518Z 2025-12-04T10:49:11.0114664Z [W1204 10:12:21.976797297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0114666Z 2025-12-04T10:49:11.0114715Z ('RERUN', {'yellow': True}) [0.5822s] [100%] 2025-12-04T10:49:11.0115067Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:12:22.582358625 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115072Z 2025-12-04T10:49:11.0115220Z [W1204 10:12:22.582722648 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115223Z 2025-12-04T10:49:11.0115372Z [W1204 10:12:22.582809716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115373Z 2025-12-04T10:49:11.0115519Z [W1204 10:12:22.584190868 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115521Z 2025-12-04T10:49:11.0115668Z [W1204 10:12:22.584455443 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0115695Z 2025-12-04T10:49:11.0115842Z [W1204 10:12:22.584532092 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115846Z 2025-12-04T10:49:11.0115992Z [W1204 10:12:22.586715528 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0115994Z 2025-12-04T10:49:11.0116143Z [W1204 10:12:22.586977202 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0116144Z 2025-12-04T10:49:11.0116290Z [W1204 10:12:22.587059271 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0116292Z 2025-12-04T10:49:11.0116332Z FAILED [0.6057s] [100%] 2025-12-04T10:49:11.0116335Z 2025-12-04T10:49:11.0116387Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0116540Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0116584Z Traceback (most recent call last): 2025-12-04T10:49:11.0116740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0116780Z method(*args, **kwargs) 2025-12-04T10:49:11.0116932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0116972Z method(*args, **kwargs) 2025-12-04T10:49:11.0117121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0117159Z with policy(): 2025-12-04T10:49:11.0117310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0117351Z raise RuntimeError(msg) 2025-12-04T10:49:11.0117743Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0117745Z 2025-12-04T10:49:11.0117844Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0118129Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0118131Z 2025-12-04T10:49:11.0118218Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0118288Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0118344Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0118618Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0118688Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0118724Z graph_break [] 2025-12-04T10:49:11.0118796Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0119140Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0119183Z if out == self.unknown_value: 2025-12-04T10:49:11.0119331Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0119395Z Traceback (most recent call last): 2025-12-04T10:49:11.0119548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0119586Z method(*args, **kwargs) 2025-12-04T10:49:11.0119735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0126043Z method(*args, **kwargs) 2025-12-04T10:49:11.0126214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0126253Z with policy(): 2025-12-04T10:49:11.0126406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0126447Z raise RuntimeError(msg) 2025-12-04T10:49:11.0126847Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0126856Z 2025-12-04T10:49:11.0126932Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0127224Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0127227Z 2025-12-04T10:49:11.0127312Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0127387Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0127444Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0127724Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0127795Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0127832Z graph_break [] 2025-12-04T10:49:11.0127957Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0128301Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0128344Z if out == self.unknown_value: 2025-12-04T10:49:11.0128416Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0128470Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0128545Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0128812Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0128848Z graph_break [] 2025-12-04T10:49:11.0128902Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0129052Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0129099Z Traceback (most recent call last): 2025-12-04T10:49:11.0131501Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0131541Z method(*args, **kwargs) 2025-12-04T10:49:11.0131691Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0131753Z method(*args, **kwargs) 2025-12-04T10:49:11.0131948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0131985Z with policy(): 2025-12-04T10:49:11.0132142Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0132183Z raise RuntimeError(msg) 2025-12-04T10:49:11.0132585Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0132615Z 2025-12-04T10:49:11.0132689Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0132979Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0132981Z 2025-12-04T10:49:11.0133067Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0133141Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0133197Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0133467Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0133539Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0133575Z graph_break [] 2025-12-04T10:49:11.0133649Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0133990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0134034Z if out == self.unknown_value: 2025-12-04T10:49:11.0134138Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0134193Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0134264Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0134532Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0134570Z graph_break [] 2025-12-04T10:49:11.0134640Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0134693Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0134763Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0135034Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0135069Z graph_break [] 2025-12-04T10:49:11.0135313Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a164a19da10499d2.xml - 2025-12-04T10:49:11.0135447Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0136081Z FAILED [0.6057s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0136104Z 2025-12-04T10:49:11.0136175Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0136460Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0136464Z 2025-12-04T10:49:11.0136549Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0136611Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0136679Z ================== 1 failed, 57 deselected, 2 rerun in 10.66s ================== 2025-12-04T10:49:11.0136715Z Got exit code 1 2025-12-04T10:49:11.0136953Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0137084Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0137283Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b7e76e4741c7a5ea.xml 2025-12-04T10:49:11.0137340Z ============================= test session starts ============================== 2025-12-04T10:49:11.0137457Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0137498Z cachedir: .pytest_cache 2025-12-04T10:49:11.0137658Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0137704Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0137746Z configfile: pytest.ini 2025-12-04T10:49:11.0137939Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0138015Z collecting ... collected 58 items / 12 deselected / 46 selected 2025-12-04T10:49:11.0138067Z stepcurrent: skipping 12 already run items. 2025-12-04T10:49:11.0138112Z Running 46 items in this shard 2025-12-04T10:49:11.0138114Z 2025-12-04T10:49:11.0138361Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0187s] [ 2%] 2025-12-04T10:49:11.0138602Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6130s] [ 2%] 2025-12-04T10:49:11.0138825Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.5753s] [ 2%] 2025-12-04T10:49:11.0138828Z 2025-12-04T10:49:11.0138880Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0139029Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0139073Z Traceback (most recent call last): 2025-12-04T10:49:11.0139252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0139292Z method(*args, **kwargs) 2025-12-04T10:49:11.0139443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0139495Z method(*args, **kwargs) 2025-12-04T10:49:11.0139645Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0139680Z with policy(): 2025-12-04T10:49:11.0139835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0139876Z raise RuntimeError(msg) 2025-12-04T10:49:11.0140266Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0140270Z 2025-12-04T10:49:11.0140342Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0140627Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0140630Z 2025-12-04T10:49:11.0140715Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0140787Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0140842Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0141278Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0141352Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0141390Z graph_break [] 2025-12-04T10:49:11.0141539Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 
_ 2025-12-04T10:49:11.0141584Z Traceback (most recent call last): 2025-12-04T10:49:11.0141736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0141775Z method(*args, **kwargs) 2025-12-04T10:49:11.0141998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0142038Z method(*args, **kwargs) 2025-12-04T10:49:11.0142186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0142223Z with policy(): 2025-12-04T10:49:11.0142373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0142415Z raise RuntimeError(msg) 2025-12-04T10:49:11.0142813Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0142816Z 2025-12-04T10:49:11.0142890Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0143178Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0143197Z 2025-12-04T10:49:11.0143281Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0143353Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0143425Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0143696Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0143767Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0143808Z graph_break [] 2025-12-04T10:49:11.0143877Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0143930Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0143999Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0144267Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0144306Z graph_break [] 2025-12-04T10:49:11.0144358Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0144505Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0144550Z Traceback (most recent call last): 2025-12-04T10:49:11.0144703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0144743Z method(*args, **kwargs) 2025-12-04T10:49:11.0144890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0144930Z method(*args, **kwargs) 2025-12-04T10:49:11.0145078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0145115Z with policy(): 2025-12-04T10:49:11.0145266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0145306Z raise RuntimeError(msg) 2025-12-04T10:49:11.0145736Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0145739Z 2025-12-04T10:49:11.0145810Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0146094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0146097Z 2025-12-04T10:49:11.0146181Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0146253Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0146306Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0146574Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0146645Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0146680Z graph_break [] 2025-12-04T10:49:11.0146750Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0146815Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0146884Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0147150Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0147198Z graph_break [] 2025-12-04T10:49:11.0147267Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0147321Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0147392Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0147662Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0147698Z graph_break [] 2025-12-04T10:49:11.0147943Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b7e76e4741c7a5ea.xml - 2025-12-04T10:49:11.0148002Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0148625Z FAILED [0.5753s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0148628Z 2025-12-04T10:49:11.0148700Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0148986Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0148989Z 2025-12-04T10:49:11.0149073Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0149133Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0149200Z ================== 1 failed, 12 deselected, 2 rerun in 4.35s =================== 2025-12-04T10:49:11.0149258Z Got exit code 1 2025-12-04T10:49:11.0149298Z Retrying single test... 2025-12-04T10:49:11.0149494Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6e4e98ad66105123.xml 2025-12-04T10:49:11.0149552Z ============================= test session starts ============================== 2025-12-04T10:49:11.0149665Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0149707Z cachedir: .pytest_cache 2025-12-04T10:49:11.0149865Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0149911Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0149951Z configfile: pytest.ini 2025-12-04T10:49:11.0150113Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0150189Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0150470Z stepcurrent: skipping 12 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0150527Z Running 1 items in this shard 2025-12-04T10:49:11.0150530Z 2025-12-04T10:49:11.0150887Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:12:43.723721837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0150903Z 2025-12-04T10:49:11.0151056Z [W1204 10:12:50.050572283 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151058Z 2025-12-04T10:49:11.0151208Z [W1204 10:12:50.050723700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151212Z 2025-12-04T10:49:11.0151358Z [W1204 10:12:50.054621982 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151362Z 2025-12-04T10:49:11.0151508Z [W1204 10:12:50.054911236 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151512Z 2025-12-04T10:49:11.0151659Z [W1204 10:12:50.054990844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151661Z 2025-12-04T10:49:11.0151809Z [W1204 10:12:50.057453405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151810Z 2025-12-04T10:49:11.0151992Z [W1204 10:12:50.057722880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0151995Z 2025-12-04T10:49:11.0152142Z [W1204 10:12:50.057794999 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0152144Z 2025-12-04T10:49:11.0152197Z ('RERUN', {'yellow': True}) [10.3158s] [100%] 2025-12-04T10:49:11.0152550Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:12:51.746685140 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0152554Z 2025-12-04T10:49:11.0152701Z [W1204 10:12:51.747080712 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0152703Z 2025-12-04T10:49:11.0152876Z [W1204 10:12:51.747173340 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0152878Z 2025-12-04T10:49:11.0153026Z [W1204 10:12:51.748559353 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0153027Z 2025-12-04T10:49:11.0153175Z [W1204 10:12:51.748823227 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0153177Z 2025-12-04T10:49:11.0153322Z [W1204 10:12:51.748898076 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0153325Z 2025-12-04T10:49:11.0153472Z [W1204 10:12:51.751099462 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0153473Z 2025-12-04T10:49:11.0153621Z [W1204 10:12:51.751351237 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0153623Z 2025-12-04T10:49:11.0153770Z [W1204 10:12:51.751423835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0153771Z 2025-12-04T10:49:11.0153820Z ('RERUN', {'yellow': True}) [0.5602s] [100%] 2025-12-04T10:49:11.0154186Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:12:51.308662254 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154207Z 2025-12-04T10:49:11.0154357Z [W1204 10:12:51.309060496 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154359Z 2025-12-04T10:49:11.0154507Z [W1204 10:12:51.309147374 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154509Z 2025-12-04T10:49:11.0154656Z [W1204 10:12:51.310534446 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154658Z 2025-12-04T10:49:11.0154803Z [W1204 10:12:51.310782652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154808Z 2025-12-04T10:49:11.0154953Z [W1204 10:12:51.310857160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0154956Z 2025-12-04T10:49:11.0155103Z [W1204 10:12:51.313030787 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0155104Z 2025-12-04T10:49:11.0155252Z [W1204 10:12:51.313279682 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0155254Z 2025-12-04T10:49:11.0155400Z [W1204 10:12:51.313351360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0155402Z 2025-12-04T10:49:11.0155440Z FAILED [0.5539s] [100%] 2025-12-04T10:49:11.0155443Z 2025-12-04T10:49:11.0155496Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0155645Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0155692Z Traceback (most recent call last): 2025-12-04T10:49:11.0155848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0155887Z method(*args, **kwargs) 2025-12-04T10:49:11.0156039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0156099Z method(*args, **kwargs) 2025-12-04T10:49:11.0156251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0156287Z with policy(): 2025-12-04T10:49:11.0156439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0156479Z raise RuntimeError(msg) 2025-12-04T10:49:11.0156873Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0156877Z 2025-12-04T10:49:11.0156950Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0157237Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0157240Z 2025-12-04T10:49:11.0157326Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0157411Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0157467Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0157737Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0157823Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0157858Z graph_break [] 2025-12-04T10:49:11.0157930Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0158271Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0158316Z if out == self.unknown_value: 2025-12-04T10:49:11.0158463Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0158507Z Traceback (most recent call last): 2025-12-04T10:49:11.0158662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0158701Z method(*args, **kwargs) 2025-12-04T10:49:11.0158850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0158889Z method(*args, **kwargs) 2025-12-04T10:49:11.0159039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0159076Z with policy(): 2025-12-04T10:49:11.0159227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0159268Z raise RuntimeError(msg) 2025-12-04T10:49:11.0159666Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0159669Z 2025-12-04T10:49:11.0159741Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0160055Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0160058Z 2025-12-04T10:49:11.0160143Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0160215Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0160271Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0160540Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0160612Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0160650Z graph_break [] 2025-12-04T10:49:11.0160720Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0161067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0161111Z if out == self.unknown_value: 2025-12-04T10:49:11.0161192Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0161249Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0161319Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0161602Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0161638Z graph_break [] 2025-12-04T10:49:11.0161691Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0161840Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0161920Z Traceback (most recent call last): 2025-12-04T10:49:11.0162072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0162113Z method(*args, **kwargs) 2025-12-04T10:49:11.0162262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0162305Z method(*args, **kwargs) 2025-12-04T10:49:11.0162454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0162492Z with policy(): 2025-12-04T10:49:11.0162644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0162688Z raise RuntimeError(msg) 2025-12-04T10:49:11.0163089Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0163093Z 2025-12-04T10:49:11.0163164Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0163450Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0163453Z 2025-12-04T10:49:11.0163537Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0163608Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0163689Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0163959Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0164031Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0164069Z graph_break [] 2025-12-04T10:49:11.0164140Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0164479Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0164523Z if out == self.unknown_value: 2025-12-04T10:49:11.0164597Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0164651Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0164722Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0164989Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0165038Z graph_break [] 2025-12-04T10:49:11.0165129Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0165182Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0165253Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0165520Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0165558Z graph_break [] 2025-12-04T10:49:11.0165800Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6e4e98ad66105123.xml - 2025-12-04T10:49:11.0165861Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0166485Z FAILED [0.5539s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0166489Z 2025-12-04T10:49:11.0166561Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0166847Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0166850Z 2025-12-04T10:49:11.0166933Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0166995Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0167061Z ================== 1 failed, 57 deselected, 2 rerun in 11.57s ================== 2025-12-04T10:49:11.0167100Z Got exit code 1 2025-12-04T10:49:11.0167140Z Retrying single test... 2025-12-04T10:49:11.0167338Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dc0d832ecc736cb5.xml 2025-12-04T10:49:11.0167418Z ============================= test session starts ============================== 2025-12-04T10:49:11.0167533Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0167575Z cachedir: .pytest_cache 2025-12-04T10:49:11.0167732Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0167780Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0167820Z configfile: pytest.ini 2025-12-04T10:49:11.0167986Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0168060Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0168344Z stepcurrent: skipping 12 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0168390Z Running 1 items in this shard 2025-12-04T10:49:11.0168392Z 2025-12-04T10:49:11.0168750Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:13:01.308560458 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0168765Z 2025-12-04T10:49:11.0168917Z [W1204 10:13:09.987953668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0168932Z 2025-12-04T10:49:11.0169080Z [W1204 10:13:09.988098976 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169082Z 2025-12-04T10:49:11.0169231Z [W1204 10:13:09.991396920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169234Z 2025-12-04T10:49:11.0169380Z [W1204 10:13:09.991705354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169382Z 2025-12-04T10:49:11.0169530Z [W1204 10:13:09.991779262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169533Z 2025-12-04T10:49:11.0169680Z [W1204 10:13:09.994389640 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0169683Z 2025-12-04T10:49:11.0169832Z [W1204 10:13:09.994658795 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169833Z 2025-12-04T10:49:11.0169981Z [W1204 10:13:09.994756663 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0169983Z 2025-12-04T10:49:11.0170034Z ('RERUN', {'yellow': True}) [10.7365s] [100%] 2025-12-04T10:49:11.0170388Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:13:10.664630308 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0170391Z 2025-12-04T10:49:11.0170538Z [W1204 10:13:10.665059100 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0170541Z 2025-12-04T10:49:11.0170689Z [W1204 10:13:10.665158568 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0170691Z 2025-12-04T10:49:11.0170838Z [W1204 10:13:10.666592559 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0170840Z 2025-12-04T10:49:11.0171009Z [W1204 10:13:10.666868034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0171011Z 2025-12-04T10:49:11.0171159Z [W1204 10:13:10.666948522 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0171163Z 2025-12-04T10:49:11.0171308Z [W1204 10:13:10.669211807 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0171311Z 2025-12-04T10:49:11.0171458Z [W1204 10:13:10.669479002 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0171460Z 2025-12-04T10:49:11.0171607Z [W1204 10:13:10.669557051 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0171609Z 2025-12-04T10:49:11.0171661Z ('RERUN', {'yellow': True}) [0.5295s] [100%] 2025-12-04T10:49:11.0172052Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:13:10.191763442 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172072Z 2025-12-04T10:49:11.0172220Z [W1204 10:13:10.192178084 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172242Z 2025-12-04T10:49:11.0172388Z [W1204 10:13:10.192275802 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172390Z 2025-12-04T10:49:11.0172537Z [W1204 10:13:10.193699084 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172539Z 2025-12-04T10:49:11.0172688Z [W1204 10:13:10.193969758 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0172690Z 2025-12-04T10:49:11.0172837Z [W1204 10:13:10.194052117 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172840Z 2025-12-04T10:49:11.0172987Z [W1204 10:13:10.196329181 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0172990Z 2025-12-04T10:49:11.0173135Z [W1204 10:13:10.196591156 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0173137Z 2025-12-04T10:49:11.0173284Z [W1204 10:13:10.196670195 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0173286Z 2025-12-04T10:49:11.0173323Z FAILED [0.5070s] [100%] 2025-12-04T10:49:11.0173326Z 2025-12-04T10:49:11.0173379Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0173527Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0173573Z Traceback (most recent call last): 2025-12-04T10:49:11.0173728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0173770Z method(*args, **kwargs) 2025-12-04T10:49:11.0173922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0173962Z method(*args, **kwargs) 2025-12-04T10:49:11.0174112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0174149Z with policy(): 2025-12-04T10:49:11.0174327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0174367Z raise RuntimeError(msg) 2025-12-04T10:49:11.0174758Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0174762Z 2025-12-04T10:49:11.0174838Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0175125Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0175128Z 2025-12-04T10:49:11.0175214Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0175286Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0175340Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0175609Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0175693Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0175742Z graph_break [] 2025-12-04T10:49:11.0175812Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0176156Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0176202Z if out == self.unknown_value: 2025-12-04T10:49:11.0176351Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0176396Z Traceback (most recent call last): 2025-12-04T10:49:11.0176548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0176589Z method(*args, **kwargs) 2025-12-04T10:49:11.0176740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0176781Z method(*args, **kwargs) 2025-12-04T10:49:11.0176931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0176968Z with policy(): 2025-12-04T10:49:11.0177121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0177162Z raise RuntimeError(msg) 2025-12-04T10:49:11.0177558Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0177562Z 2025-12-04T10:49:11.0177634Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0177922Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0177924Z 2025-12-04T10:49:11.0178008Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0178103Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0178158Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0178434Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0178507Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0178543Z graph_break [] 2025-12-04T10:49:11.0178614Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0178954Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0178997Z if out == self.unknown_value: 2025-12-04T10:49:11.0179068Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0179124Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0179194Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0179462Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0179520Z graph_break [] 2025-12-04T10:49:11.0179572Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0179720Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0179766Z Traceback (most recent call last): 2025-12-04T10:49:11.0179919Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0179959Z method(*args, **kwargs) 2025-12-04T10:49:11.0180108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0180147Z method(*args, **kwargs) 2025-12-04T10:49:11.0180297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0180333Z with policy(): 2025-12-04T10:49:11.0180485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0180528Z raise RuntimeError(msg) 2025-12-04T10:49:11.0180929Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0180933Z 2025-12-04T10:49:11.0181004Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0181290Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0181293Z 2025-12-04T10:49:11.0181377Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0181450Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0181504Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0181793Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0181977Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0182014Z graph_break [] 2025-12-04T10:49:11.0182084Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0182422Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0182467Z if out == self.unknown_value: 2025-12-04T10:49:11.0182538Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0182593Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0182663Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0182933Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0182968Z graph_break [] 2025-12-04T10:49:11.0183039Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0183108Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0183178Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0183441Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0183494Z graph_break [] 2025-12-04T10:49:11.0183736Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dc0d832ecc736cb5.xml - 2025-12-04T10:49:11.0183796Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0184419Z FAILED [0.5070s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0184424Z 2025-12-04T10:49:11.0184494Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0184782Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0184784Z 2025-12-04T10:49:11.0184866Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0184928Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0184993Z ================== 1 failed, 57 deselected, 2 rerun in 11.92s ================== 2025-12-04T10:49:11.0185031Z Got exit code 1 2025-12-04T10:49:11.0185265Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0185393Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0185589Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6401dade5a2637c1.xml 2025-12-04T10:49:11.0185670Z ============================= test session starts ============================== 2025-12-04T10:49:11.0185782Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0185822Z cachedir: .pytest_cache 2025-12-04T10:49:11.0185981Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0186027Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0186067Z configfile: pytest.ini 2025-12-04T10:49:11.0186227Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0186300Z collecting ... collected 58 items / 13 deselected / 45 selected 2025-12-04T10:49:11.0186352Z stepcurrent: skipping 13 already run items. 2025-12-04T10:49:11.0186395Z Running 45 items in this shard 2025-12-04T10:49:11.0186397Z 2025-12-04T10:49:11.0186648Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8058s] [ 2%] 2025-12-04T10:49:11.0186895Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7701s] [ 2%] 2025-12-04T10:49:11.0187133Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.7470s] [ 2%] 2025-12-04T10:49:11.0187148Z 2025-12-04T10:49:11.0187201Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0187351Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0187395Z Traceback (most recent call last): 2025-12-04T10:49:11.0187552Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0187590Z method(*args, **kwargs) 2025-12-04T10:49:11.0187741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0187780Z method(*args, **kwargs) 2025-12-04T10:49:11.0187932Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0187967Z with policy(): 2025-12-04T10:49:11.0188121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0188161Z raise RuntimeError(msg) 2025-12-04T10:49:11.0188560Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0188563Z 2025-12-04T10:49:11.0188634Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0188924Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0188927Z 2025-12-04T10:49:11.0189012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0189083Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0189137Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0189314Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0189406Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0189442Z graph_break [] 2025-12-04T10:49:11.0189592Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0189635Z Traceback (most recent call 
last): 2025-12-04T10:49:11.0189789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0189828Z method(*args, **kwargs) 2025-12-04T10:49:11.0189978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0190016Z method(*args, **kwargs) 2025-12-04T10:49:11.0190164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0190200Z with policy(): 2025-12-04T10:49:11.0190353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0190392Z raise RuntimeError(msg) 2025-12-04T10:49:11.0190801Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0190826Z 2025-12-04T10:49:11.0190899Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0191187Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0191189Z 2025-12-04T10:49:11.0191276Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0191346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0191401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0191576Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0191648Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0191683Z graph_break [] 2025-12-04T10:49:11.0191755Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0191807Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0191919Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0192092Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0192128Z graph_break [] 2025-12-04T10:49:11.0192179Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0192329Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0192373Z Traceback (most recent call last): 2025-12-04T10:49:11.0192528Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0192568Z method(*args, **kwargs) 2025-12-04T10:49:11.0192719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0192759Z method(*args, **kwargs) 2025-12-04T10:49:11.0192908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0192948Z with policy(): 2025-12-04T10:49:11.0193129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0193171Z raise RuntimeError(msg) 2025-12-04T10:49:11.0193578Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0193582Z 2025-12-04T10:49:11.0193656Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0193947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0193950Z 2025-12-04T10:49:11.0194038Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0194109Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0194165Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0194341Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0194433Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0194485Z graph_break [] 2025-12-04T10:49:11.0194556Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0194612Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0194682Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0194858Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0194895Z graph_break [] 2025-12-04T10:49:11.0194967Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0195020Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0195093Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0195266Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0195307Z graph_break [] 2025-12-04T10:49:11.0195548Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6401dade5a2637c1.xml - 2025-12-04T10:49:11.0195609Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0196255Z FAILED [0.7470s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0196261Z 2025-12-04T10:49:11.0196332Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0196620Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0196622Z 2025-12-04T10:49:11.0196705Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0196793Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0196858Z ================== 1 failed, 13 deselected, 2 rerun in 4.47s =================== 2025-12-04T10:49:11.0196896Z Got exit code 1 2025-12-04T10:49:11.0196936Z Retrying single test... 
2025-12-04T10:49:11.0197133Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d58476ea23671a66.xml 2025-12-04T10:49:11.0197188Z ============================= test session starts ============================== 2025-12-04T10:49:11.0197301Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0197340Z cachedir: .pytest_cache 2025-12-04T10:49:11.0197499Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0197544Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0197586Z configfile: pytest.ini 2025-12-04T10:49:11.0197747Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0197820Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0198118Z stepcurrent: skipping 13 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0198176Z Running 1 items in this shard 2025-12-04T10:49:11.0198179Z 2025-12-04T10:49:11.0198543Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:31.815339316 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0198545Z 2025-12-04T10:49:11.0198698Z [W1204 10:13:38.197217707 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0198700Z 2025-12-04T10:49:11.0198850Z [W1204 10:13:38.197378984 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0198853Z 2025-12-04T10:49:11.0199001Z [W1204 10:13:38.201116050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199004Z 2025-12-04T10:49:11.0199150Z [W1204 10:13:38.201412714 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199152Z 2025-12-04T10:49:11.0199299Z [W1204 10:13:38.201492373 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199301Z 2025-12-04T10:49:11.0199449Z [W1204 10:13:38.203908605 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199451Z 2025-12-04T10:49:11.0199597Z [W1204 10:13:38.204180050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199601Z 2025-12-04T10:49:11.0199747Z [W1204 10:13:38.204258138 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0199749Z 2025-12-04T10:49:11.0199802Z ('RERUN', {'yellow': True}) [10.1762s] [100%] 2025-12-04T10:49:11.0200161Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:39.404751305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0200163Z 2025-12-04T10:49:11.0200340Z [W1204 10:13:39.405124528 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0200342Z 2025-12-04T10:49:11.0200491Z [W1204 10:13:39.405213106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0200494Z 2025-12-04T10:49:11.0200641Z [W1204 10:13:39.406580529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0200645Z 2025-12-04T10:49:11.0200792Z [W1204 10:13:39.406907873 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0200794Z 2025-12-04T10:49:11.0200943Z [W1204 10:13:39.406985501 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0200945Z 2025-12-04T10:49:11.0201093Z [W1204 10:13:39.409179708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0201095Z 2025-12-04T10:49:11.0201243Z [W1204 10:13:39.409435603 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0201257Z 2025-12-04T10:49:11.0201402Z [W1204 10:13:39.409511451 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0201405Z 2025-12-04T10:49:11.0201467Z ('RERUN', {'yellow': True}) [0.6950s] [100%] 2025-12-04T10:49:11.0201826Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:40.098878370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0201828Z 2025-12-04T10:49:11.0202020Z [W1204 10:13:40.099255732 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202022Z 2025-12-04T10:49:11.0202170Z [W1204 10:13:40.099339471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202173Z 2025-12-04T10:49:11.0202320Z [W1204 10:13:40.100707263 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202321Z 2025-12-04T10:49:11.0202472Z [W1204 10:13:40.101033437 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202474Z 2025-12-04T10:49:11.0202623Z [W1204 10:13:40.101113035 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202625Z 2025-12-04T10:49:11.0202773Z [W1204 10:13:40.103326322 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0202775Z 2025-12-04T10:49:11.0202924Z [W1204 10:13:40.103584307 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0202926Z 2025-12-04T10:49:11.0203074Z [W1204 10:13:40.103659665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0203075Z 2025-12-04T10:49:11.0203116Z FAILED [0.7177s] [100%] 2025-12-04T10:49:11.0203119Z 2025-12-04T10:49:11.0203171Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0203321Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0203367Z Traceback (most recent call last): 2025-12-04T10:49:11.0203555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0203596Z method(*args, **kwargs) 2025-12-04T10:49:11.0203747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0203788Z method(*args, **kwargs) 2025-12-04T10:49:11.0203938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0203976Z with policy(): 2025-12-04T10:49:11.0204127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0204168Z raise RuntimeError(msg) 2025-12-04T10:49:11.0204572Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0204574Z 2025-12-04T10:49:11.0204647Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0204935Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0204952Z 2025-12-04T10:49:11.0205037Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0205126Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0205180Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0205359Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0205432Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0205469Z graph_break [] 2025-12-04T10:49:11.0205540Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0205882Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0205926Z if out == self.unknown_value: 2025-12-04T10:49:11.0206081Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0206125Z Traceback (most recent call last): 2025-12-04T10:49:11.0206276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0206315Z method(*args, **kwargs) 2025-12-04T10:49:11.0206467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0206506Z method(*args, **kwargs) 2025-12-04T10:49:11.0206657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0206694Z with policy(): 2025-12-04T10:49:11.0206850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0206892Z raise RuntimeError(msg) 2025-12-04T10:49:11.0207302Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0207305Z 2025-12-04T10:49:11.0207399Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0207685Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0207688Z 2025-12-04T10:49:11.0207775Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0207846Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0207903Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0208075Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0208147Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0208182Z graph_break [] 2025-12-04T10:49:11.0208256Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0208595Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0208653Z if out == self.unknown_value: 2025-12-04T10:49:11.0208723Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0208789Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0208862Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0209037Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0209073Z graph_break [] 2025-12-04T10:49:11.0209125Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0209278Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0209321Z Traceback (most recent call last): 2025-12-04T10:49:11.0209476Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0209517Z method(*args, **kwargs) 2025-12-04T10:49:11.0209669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0209709Z method(*args, **kwargs) 2025-12-04T10:49:11.0209862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0209898Z with policy(): 2025-12-04T10:49:11.0210053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0210092Z raise RuntimeError(msg) 2025-12-04T10:49:11.0210500Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0210504Z 2025-12-04T10:49:11.0210577Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0210867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0210869Z 2025-12-04T10:49:11.0210955Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0211046Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0211101Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0211274Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0211347Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0211382Z graph_break [] 2025-12-04T10:49:11.0211452Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0211793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0211838Z if out == self.unknown_value: 2025-12-04T10:49:11.0211945Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0212000Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0212071Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0212243Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0212296Z graph_break [] 2025-12-04T10:49:11.0212365Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0212435Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0212503Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0212675Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0212710Z graph_break [] 2025-12-04T10:49:11.0212951Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d58476ea23671a66.xml - 2025-12-04T10:49:11.0213008Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0213653Z FAILED [0.7177s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0213659Z 2025-12-04T10:49:11.0213729Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0214020Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0214022Z 2025-12-04T10:49:11.0214106Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0214167Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0214234Z ================== 1 failed, 57 deselected, 2 rerun in 11.74s ================== 2025-12-04T10:49:11.0214272Z Got exit code 1 2025-12-04T10:49:11.0214313Z Retrying single test... 2025-12-04T10:49:11.0214509Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bdc520b49d319d73.xml 2025-12-04T10:49:11.0214567Z ============================= test session starts ============================== 2025-12-04T10:49:11.0214712Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0214753Z cachedir: .pytest_cache 2025-12-04T10:49:11.0214910Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0214955Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0214995Z configfile: pytest.ini 2025-12-04T10:49:11.0215157Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0215229Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0215518Z stepcurrent: skipping 13 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0215560Z Running 1 items in this shard 2025-12-04T10:49:11.0215563Z 2025-12-04T10:49:11.0215927Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:49.270092822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0215930Z 2025-12-04T10:49:11.0216091Z [W1204 10:13:57.625060963 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216093Z 2025-12-04T10:49:11.0216242Z [W1204 10:13:57.625210850 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216258Z 2025-12-04T10:49:11.0216406Z [W1204 10:13:57.629150863 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216408Z 2025-12-04T10:49:11.0216558Z [W1204 10:13:57.629455417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216560Z 2025-12-04T10:49:11.0216706Z [W1204 10:13:57.629537055 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216708Z 2025-12-04T10:49:11.0216855Z [W1204 10:13:57.632083655 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0216858Z 2025-12-04T10:49:11.0217004Z [W1204 10:13:57.632349520 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0217007Z 2025-12-04T10:49:11.0217155Z [W1204 10:13:57.632426198 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0217157Z 2025-12-04T10:49:11.0217206Z ('RERUN', {'yellow': True}) [10.1135s] [100%] 2025-12-04T10:49:11.0217567Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:58.887417752 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0217569Z 2025-12-04T10:49:11.0217717Z [W1204 10:13:58.887807024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0217720Z 2025-12-04T10:49:11.0217868Z [W1204 10:13:58.887896983 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0217871Z 2025-12-04T10:49:11.0218020Z [W1204 10:13:58.889285275 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0218022Z 2025-12-04T10:49:11.0218188Z [W1204 10:13:58.889617199 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0218191Z 2025-12-04T10:49:11.0218338Z [W1204 10:13:58.889694637 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0218340Z 2025-12-04T10:49:11.0218486Z [W1204 10:13:58.891978132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0218489Z 2025-12-04T10:49:11.0218636Z [W1204 10:13:58.892240417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0218638Z 2025-12-04T10:49:11.0218786Z [W1204 10:13:58.892318045 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0218787Z 2025-12-04T10:49:11.0218835Z ('RERUN', {'yellow': True}) [0.7607s] [100%] 2025-12-04T10:49:11.0219195Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:13:59.649714543 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0219197Z 2025-12-04T10:49:11.0219346Z [W1204 10:13:59.650112425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0219360Z 2025-12-04T10:49:11.0219507Z [W1204 10:13:59.650194814 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0219522Z 2025-12-04T10:49:11.0219670Z [W1204 10:13:59.651575086 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0219671Z 2025-12-04T10:49:11.0219817Z [W1204 10:13:59.651895680 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0219821Z 2025-12-04T10:49:11.0219969Z [W1204 10:13:59.651974418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0219970Z 2025-12-04T10:49:11.0220118Z [W1204 10:13:59.654282243 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0220121Z 2025-12-04T10:49:11.0220267Z [W1204 10:13:59.654540558 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0220270Z 2025-12-04T10:49:11.0220418Z [W1204 10:13:59.654615726 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0220420Z 2025-12-04T10:49:11.0220457Z FAILED [0.7618s] [100%] 2025-12-04T10:49:11.0220459Z 2025-12-04T10:49:11.0220511Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0220661Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0220708Z Traceback (most recent call last): 2025-12-04T10:49:11.0220864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0220905Z method(*args, **kwargs) 2025-12-04T10:49:11.0221056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0221098Z method(*args, **kwargs) 2025-12-04T10:49:11.0221248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0221286Z with policy(): 2025-12-04T10:49:11.0221439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0221502Z raise RuntimeError(msg) 2025-12-04T10:49:11.0221944Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0221947Z 2025-12-04T10:49:11.0222019Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0222308Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0222311Z 2025-12-04T10:49:11.0222396Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0222469Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0222525Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0222701Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0222771Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0222824Z graph_break [] 2025-12-04T10:49:11.0222894Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0223251Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0223295Z if out == self.unknown_value: 2025-12-04T10:49:11.0223444Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0223490Z Traceback (most recent call last): 2025-12-04T10:49:11.0223642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0223682Z method(*args, **kwargs) 2025-12-04T10:49:11.0223832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0223873Z method(*args, **kwargs) 2025-12-04T10:49:11.0224022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0224061Z with policy(): 2025-12-04T10:49:11.0224212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0224252Z raise RuntimeError(msg) 2025-12-04T10:49:11.0224663Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0224665Z 2025-12-04T10:49:11.0224738Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0225026Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0225029Z 2025-12-04T10:49:11.0225114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0225186Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0225240Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0225440Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0225510Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0225547Z graph_break [] 2025-12-04T10:49:11.0225617Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0225961Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0226004Z if out == self.unknown_value: 2025-12-04T10:49:11.0226075Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0226128Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0226199Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0226373Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0226410Z graph_break [] 2025-12-04T10:49:11.0226462Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0226627Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0226672Z Traceback (most recent call last): 2025-12-04T10:49:11.0226842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0226881Z method(*args, **kwargs) 2025-12-04T10:49:11.0227029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0227068Z method(*args, **kwargs) 2025-12-04T10:49:11.0227218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0227255Z with policy(): 2025-12-04T10:49:11.0227406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0227448Z raise RuntimeError(msg) 2025-12-04T10:49:11.0227854Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0227858Z 2025-12-04T10:49:11.0227930Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0228220Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0228224Z 2025-12-04T10:49:11.0228309Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0228380Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0228434Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0228608Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0228679Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0228717Z graph_break [] 2025-12-04T10:49:11.0228786Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0229151Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0229193Z if out == self.unknown_value: 2025-12-04T10:49:11.0229264Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0229319Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0229390Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0229563Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0229601Z graph_break [] 2025-12-04T10:49:11.0229671Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0229725Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0229798Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0229972Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0230007Z graph_break [] 2025-12-04T10:49:11.0230251Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bdc520b49d319d73.xml - 2025-12-04T10:49:11.0230321Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0230969Z FAILED [0.7618s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0230972Z 2025-12-04T10:49:11.0231044Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0231329Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0231333Z 2025-12-04T10:49:11.0231417Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0231478Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0231543Z ================== 1 failed, 57 deselected, 2 rerun in 11.81s ================== 2025-12-04T10:49:11.0231580Z Got exit code 1 2025-12-04T10:49:11.0231819Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0231993Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0232188Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4a762bc953aabbc3.xml 2025-12-04T10:49:11.0232247Z ============================= test session starts ============================== 2025-12-04T10:49:11.0232357Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0232400Z cachedir: .pytest_cache 2025-12-04T10:49:11.0232556Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0232602Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0232641Z configfile: pytest.ini 2025-12-04T10:49:11.0232838Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0232912Z collecting ... collected 58 items / 14 deselected / 44 selected 2025-12-04T10:49:11.0232964Z stepcurrent: skipping 14 already run items. 2025-12-04T10:49:11.0233007Z Running 44 items in this shard 2025-12-04T10:49:11.0233010Z 2025-12-04T10:49:11.0233259Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6501s] [ 2%] 2025-12-04T10:49:11.0233504Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6133s] [ 2%] 2025-12-04T10:49:11.0233725Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 FAILED [0.5823s] [ 2%] 2025-12-04T10:49:11.0233727Z 2025-12-04T10:49:11.0233779Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0233926Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0233987Z Traceback (most recent call last): 2025-12-04T10:49:11.0234141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0234198Z method(*args, **kwargs) 2025-12-04T10:49:11.0234348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0234388Z method(*args, **kwargs) 2025-12-04T10:49:11.0234537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0234574Z with policy(): 2025-12-04T10:49:11.0234727Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0234768Z raise RuntimeError(msg) 2025-12-04T10:49:11.0235160Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0235167Z 2025-12-04T10:49:11.0235238Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0235528Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0235530Z 2025-12-04T10:49:11.0235615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0235687Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0235741Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0235914Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0235986Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0236025Z graph_break [] 2025-12-04T10:49:11.0236171Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0236216Z Traceback (most recent call last): 2025-12-04T10:49:11.0236366Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0236406Z method(*args, 
**kwargs) 2025-12-04T10:49:11.0236575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0236614Z method(*args, **kwargs) 2025-12-04T10:49:11.0236764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0236801Z with policy(): 2025-12-04T10:49:11.0236953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0236994Z raise RuntimeError(msg) 2025-12-04T10:49:11.0237395Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0237397Z 2025-12-04T10:49:11.0237468Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0237757Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0237772Z 2025-12-04T10:49:11.0237855Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0237928Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0237996Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0238169Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0238240Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.0238275Z graph_break [] 2025-12-04T10:49:11.0238348Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0238401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0238471Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0238642Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0238680Z graph_break [] 2025-12-04T10:49:11.0238731Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0238884Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0238927Z Traceback (most recent call last): 2025-12-04T10:49:11.0239080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0239120Z method(*args, **kwargs) 2025-12-04T10:49:11.0239270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0239308Z method(*args, **kwargs) 2025-12-04T10:49:11.0239457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0239494Z with policy(): 2025-12-04T10:49:11.0239645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0239685Z raise RuntimeError(msg) 2025-12-04T10:49:11.0240087Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0240112Z 2025-12-04T10:49:11.0240185Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0240471Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0240474Z 2025-12-04T10:49:11.0240559Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0240630Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0240685Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0240856Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0240927Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0240963Z graph_break [] 2025-12-04T10:49:11.0241036Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0241088Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0241159Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0241356Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0241394Z graph_break [] 2025-12-04T10:49:11.0241478Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0241531Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0241601Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0241774Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0241809Z graph_break [] 2025-12-04T10:49:11.0242083Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4a762bc953aabbc3.xml - 2025-12-04T10:49:11.0242142Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0242767Z FAILED [0.5823s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0242771Z 2025-12-04T10:49:11.0242844Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0243129Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0243133Z 2025-12-04T10:49:11.0243216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0243278Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0243343Z ================== 1 failed, 14 deselected, 2 rerun in 4.00s =================== 2025-12-04T10:49:11.0243380Z Got exit code 1 2025-12-04T10:49:11.0243419Z Retrying single test... 
2025-12-04T10:49:11.0243617Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-049d8f38086ccff3.xml 2025-12-04T10:49:11.0243702Z ============================= test session starts ============================== 2025-12-04T10:49:11.0243814Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0243854Z cachedir: .pytest_cache 2025-12-04T10:49:11.0244012Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0244058Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0244099Z configfile: pytest.ini 2025-12-04T10:49:11.0244259Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0244333Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0244614Z stepcurrent: skipping 14 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0244662Z Running 1 items in this shard 2025-12-04T10:49:11.0244664Z 2025-12-04T10:49:11.0245022Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:19.163610512 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245039Z 2025-12-04T10:49:11.0245190Z [W1204 10:14:27.826286441 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0245206Z 2025-12-04T10:49:11.0245357Z [W1204 10:14:27.826439638 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245359Z 2025-12-04T10:49:11.0245508Z [W1204 10:14:27.829788302 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245510Z 2025-12-04T10:49:11.0245661Z [W1204 10:14:27.830091516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245663Z 2025-12-04T10:49:11.0245809Z [W1204 10:14:27.830176774 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245814Z 2025-12-04T10:49:11.0245960Z [W1204 10:14:27.832671226 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0245962Z 2025-12-04T10:49:11.0246110Z [W1204 10:14:27.832937380 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0246112Z 2025-12-04T10:49:11.0246258Z [W1204 10:14:27.833020639 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0246260Z 2025-12-04T10:49:11.0246312Z ('RERUN', {'yellow': True}) [10.3237s] [100%] 2025-12-04T10:49:11.0246666Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:28.920803629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0246669Z 2025-12-04T10:49:11.0246818Z [W1204 10:14:28.921287710 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0246821Z 2025-12-04T10:49:11.0246971Z [W1204 10:14:28.921394468 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0246972Z 2025-12-04T10:49:11.0247118Z [W1204 10:14:28.922789900 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0247119Z 2025-12-04T10:49:11.0247291Z [W1204 10:14:28.923142113 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0247293Z 2025-12-04T10:49:11.0247439Z [W1204 10:14:28.923223872 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0247442Z 2025-12-04T10:49:11.0247591Z [W1204 10:14:28.925430799 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0247594Z 2025-12-04T10:49:11.0247743Z [W1204 10:14:28.925693804 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0247745Z 2025-12-04T10:49:11.0247892Z [W1204 10:14:28.925775352 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0247894Z 2025-12-04T10:49:11.0247945Z ('RERUN', {'yellow': True}) [0.5839s] [100%] 2025-12-04T10:49:11.0248299Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:28.515886021 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0248313Z 2025-12-04T10:49:11.0248461Z [W1204 10:14:28.516284843 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0248474Z 2025-12-04T10:49:11.0248621Z [W1204 10:14:28.516382061 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0248623Z 2025-12-04T10:49:11.0248770Z [W1204 10:14:28.517767664 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0248772Z 2025-12-04T10:49:11.0248921Z [W1204 10:14:28.518104548 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0248923Z 2025-12-04T10:49:11.0249069Z [W1204 10:14:28.518186536 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0249072Z 2025-12-04T10:49:11.0249220Z [W1204 10:14:28.520393503 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0249222Z 2025-12-04T10:49:11.0249369Z [W1204 10:14:28.520652428 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0249373Z 2025-12-04T10:49:11.0249520Z [W1204 10:14:28.520726936 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0249521Z 2025-12-04T10:49:11.0249560Z FAILED [0.6262s] [100%] 2025-12-04T10:49:11.0249562Z 2025-12-04T10:49:11.0249614Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0249765Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0249809Z Traceback (most recent call last): 2025-12-04T10:49:11.0249967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0250006Z method(*args, **kwargs) 2025-12-04T10:49:11.0250159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0250198Z method(*args, **kwargs) 2025-12-04T10:49:11.0250348Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0250384Z with policy(): 2025-12-04T10:49:11.0250559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0250602Z raise RuntimeError(msg) 2025-12-04T10:49:11.0250995Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0250999Z 2025-12-04T10:49:11.0251073Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0251362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0251364Z 2025-12-04T10:49:11.0251450Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0251522Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0251579Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0251753Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0251838Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0251911Z graph_break [] 2025-12-04T10:49:11.0252000Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0252342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0252387Z if out == self.unknown_value: 2025-12-04T10:49:11.0252537Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0252583Z Traceback (most recent call last): 2025-12-04T10:49:11.0252736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0252775Z method(*args, **kwargs) 2025-12-04T10:49:11.0252926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0252965Z method(*args, **kwargs) 2025-12-04T10:49:11.0253115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0253150Z with policy(): 2025-12-04T10:49:11.0253301Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0253341Z raise RuntimeError(msg) 2025-12-04T10:49:11.0253742Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0253745Z 2025-12-04T10:49:11.0253816Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0254103Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0254106Z 2025-12-04T10:49:11.0254190Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0254260Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0254348Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0254521Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0254592Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0254629Z graph_break [] 2025-12-04T10:49:11.0254699Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0255038Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0255085Z if out == self.unknown_value: 2025-12-04T10:49:11.0255155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0255209Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0255281Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0255457Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0255508Z graph_break [] 2025-12-04T10:49:11.0255562Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0255728Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0255815Z Traceback (most recent call last): 2025-12-04T10:49:11.0255981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0256049Z method(*args, **kwargs) 2025-12-04T10:49:11.0256261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0256335Z method(*args, **kwargs) 2025-12-04T10:49:11.0256497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0256562Z with policy(): 2025-12-04T10:49:11.0256731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0256809Z raise RuntimeError(msg) 2025-12-04T10:49:11.0257232Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0257239Z 2025-12-04T10:49:11.0257338Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0257639Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0257642Z 2025-12-04T10:49:11.0257755Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0257866Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0257942Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0258142Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0258225Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0258293Z graph_break [] 2025-12-04T10:49:11.0258370Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0258782Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0258837Z if out == self.unknown_value: 2025-12-04T10:49:11.0258939Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0259010Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0259101Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0259302Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0259371Z graph_break [] 2025-12-04T10:49:11.0259454Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0259541Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0259623Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0259827Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0259905Z graph_break [] 2025-12-04T10:49:11.0260160Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-049d8f38086ccff3.xml - 2025-12-04T10:49:11.0260260Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0260901Z FAILED [0.6262s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0260904Z 2025-12-04T10:49:11.0261007Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0261316Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0261345Z 2025-12-04T10:49:11.0261443Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0261532Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0261609Z ================== 1 failed, 57 deselected, 2 rerun in 11.69s ================== 2025-12-04T10:49:11.0261678Z Got exit code 1 2025-12-04T10:49:11.0261738Z Retrying single test... 2025-12-04T10:49:11.0262006Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ff04c30e2b4a7609.xml 2025-12-04T10:49:11.0262075Z ============================= test session starts ============================== 2025-12-04T10:49:11.0262214Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0262260Z cachedir: .pytest_cache 2025-12-04T10:49:11.0262468Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0262528Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0262593Z configfile: pytest.ini 2025-12-04T10:49:11.0262767Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0262861Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0263202Z stepcurrent: skipping 14 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0263284Z Running 1 items in this shard 2025-12-04T10:49:11.0263288Z 2025-12-04T10:49:11.0263672Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:38.754154123 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0263676Z 2025-12-04T10:49:11.0263840Z [W1204 10:14:45.889659649 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0263842Z 2025-12-04T10:49:11.0264013Z [W1204 10:14:45.889802526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0264015Z 2025-12-04T10:49:11.0264192Z [W1204 10:14:45.893181310 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0264215Z 2025-12-04T10:49:11.0264375Z [W1204 10:14:45.893473505 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0264391Z 2025-12-04T10:49:11.0264563Z [W1204 10:14:45.893550893 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0264582Z 2025-12-04T10:49:11.0264744Z [W1204 10:14:45.895950696 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0264746Z 2025-12-04T10:49:11.0264920Z [W1204 10:14:45.896221361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0264924Z 2025-12-04T10:49:11.0265092Z [W1204 10:14:45.896298419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0265115Z 2025-12-04T10:49:11.0265176Z ('RERUN', {'yellow': True}) [9.7897s] [100%] 2025-12-04T10:49:11.0265554Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:46.962273994 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0265558Z 2025-12-04T10:49:11.0265723Z [W1204 10:14:46.962823653 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0265725Z 2025-12-04T10:49:11.0265894Z [W1204 10:14:46.962929911 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0265896Z 2025-12-04T10:49:11.0266065Z [W1204 10:14:46.964413652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0266089Z 2025-12-04T10:49:11.0266249Z [W1204 10:14:46.964816184 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0266252Z 2025-12-04T10:49:11.0269645Z [W1204 10:14:46.964899643 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0269648Z 2025-12-04T10:49:11.0269813Z [W1204 10:14:46.967250456 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0269815Z 2025-12-04T10:49:11.0269978Z [W1204 10:14:46.967523851 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0270007Z 2025-12-04T10:49:11.0270182Z [W1204 10:14:46.967602300 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0270183Z 2025-12-04T10:49:11.0270259Z ('RERUN', {'yellow': True}) [0.5644s] [100%] 2025-12-04T10:49:11.0270641Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:14:46.496292132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0270644Z 2025-12-04T10:49:11.0270801Z [W1204 10:14:46.496693474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0270803Z 2025-12-04T10:49:11.0270966Z [W1204 10:14:46.496789612 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0270972Z 2025-12-04T10:49:11.0271147Z [W1204 10:14:46.498183635 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0271149Z 2025-12-04T10:49:11.0271325Z [W1204 10:14:46.498529958 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0271350Z 2025-12-04T10:49:11.0271521Z [W1204 10:14:46.498608817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0271537Z 2025-12-04T10:49:11.0271695Z [W1204 10:14:46.500894362 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0271696Z 2025-12-04T10:49:11.0271888Z [W1204 10:14:46.501161417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0271890Z 2025-12-04T10:49:11.0272063Z [W1204 10:14:46.501238535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0272065Z 2025-12-04T10:49:11.0272139Z FAILED [0.5221s] [100%] 2025-12-04T10:49:11.0272141Z 2025-12-04T10:49:11.0272215Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0272376Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0272447Z Traceback (most recent call last): 2025-12-04T10:49:11.0272628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0272702Z method(*args, **kwargs) 2025-12-04T10:49:11.0272866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0272931Z method(*args, **kwargs) 2025-12-04T10:49:11.0273095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0273160Z with policy(): 2025-12-04T10:49:11.0273332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0273398Z raise RuntimeError(msg) 2025-12-04T10:49:11.0273805Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0273808Z 2025-12-04T10:49:11.0273904Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0274254Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0274256Z 2025-12-04T10:49:11.0274359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0274455Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0274522Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0274728Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0274808Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0274885Z graph_break [] 2025-12-04T10:49:11.0274966Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0275331Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0275388Z if out == self.unknown_value: 2025-12-04T10:49:11.0275561Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0275642Z Traceback (most recent call last): 2025-12-04T10:49:11.0275821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0275907Z method(*args, **kwargs) 2025-12-04T10:49:11.0276070Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0276128Z method(*args, **kwargs) 2025-12-04T10:49:11.0276302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0276371Z with policy(): 2025-12-04T10:49:11.0276535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0276604Z raise RuntimeError(msg) 2025-12-04T10:49:11.0277015Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0277020Z 2025-12-04T10:49:11.0277126Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0277430Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0277452Z 2025-12-04T10:49:11.0277556Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0277655Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0277721Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0277928Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0278016Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0278080Z graph_break [] 2025-12-04T10:49:11.0278164Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0278544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0278593Z if out == self.unknown_value: 2025-12-04T10:49:11.0278715Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0278781Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0278877Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0279063Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0279119Z graph_break [] 2025-12-04T10:49:11.0279192Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0279374Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0279441Z Traceback (most recent call last): 2025-12-04T10:49:11.0279608Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0279665Z method(*args, **kwargs) 2025-12-04T10:49:11.0279838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0279920Z method(*args, **kwargs) 2025-12-04T10:49:11.0280080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0280139Z with policy(): 2025-12-04T10:49:11.0280312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0280388Z raise RuntimeError(msg) 2025-12-04T10:49:11.0280806Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0280808Z 2025-12-04T10:49:11.0280905Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0281213Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0281217Z 2025-12-04T10:49:11.0281318Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0281415Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0281486Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0281681Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0281765Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0281828Z graph_break [] 2025-12-04T10:49:11.0281946Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0282325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0282381Z if out == self.unknown_value: 2025-12-04T10:49:11.0282481Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0282550Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0282637Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0282882Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0282929Z graph_break [] 2025-12-04T10:49:11.0283029Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0283095Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0283202Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0283396Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0283463Z graph_break [] 2025-12-04T10:49:11.0283719Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ff04c30e2b4a7609.xml - 2025-12-04T10:49:11.0283800Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0284440Z FAILED [0.5221s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0284462Z 2025-12-04T10:49:11.0284556Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0284892Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0284894Z 2025-12-04T10:49:11.0284989Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0285074Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0285150Z ================== 1 failed, 57 deselected, 2 rerun in 11.02s ================== 2025-12-04T10:49:11.0285213Z Got exit code 1 2025-12-04T10:49:11.0285472Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0285626Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0285839Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0b5d7ef514cc7a4a.xml 2025-12-04T10:49:11.0285918Z ============================= test session starts ============================== 2025-12-04T10:49:11.0286055Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0286120Z cachedir: .pytest_cache 2025-12-04T10:49:11.0286301Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0286357Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0286421Z configfile: pytest.ini 2025-12-04T10:49:11.0286589Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0286707Z collecting ... collected 58 items / 15 deselected / 43 selected 2025-12-04T10:49:11.0286770Z stepcurrent: skipping 15 already run items. 2025-12-04T10:49:11.0286839Z Running 43 items in this shard 2025-12-04T10:49:11.0286841Z 2025-12-04T10:49:11.0287100Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0602s] [ 2%] 2025-12-04T10:49:11.0287387Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6145s] [ 2%] 2025-12-04T10:49:11.0287629Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.5952s] [ 2%] 2025-12-04T10:49:11.0287650Z 2025-12-04T10:49:11.0287711Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0287882Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0287937Z Traceback (most recent call last): 2025-12-04T10:49:11.0288116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0288179Z method(*args, **kwargs) 2025-12-04T10:49:11.0288361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0288416Z method(*args, **kwargs) 2025-12-04T10:49:11.0288590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0288655Z with policy(): 2025-12-04T10:49:11.0288832Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0288888Z raise RuntimeError(msg) 2025-12-04T10:49:11.0289316Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0289319Z 2025-12-04T10:49:11.0289414Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0289723Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0289725Z 2025-12-04T10:49:11.0289839Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0289928Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0290010Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0290301Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0290397Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0290439Z graph_break [] 2025-12-04T10:49:11.0290631Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0290685Z Traceback (most recent call last): 2025-12-04T10:49:11.0290865Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0290917Z method(*args, **kwargs) 2025-12-04T10:49:11.0291084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0291164Z method(*args, **kwargs) 2025-12-04T10:49:11.0291326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0291389Z with policy(): 2025-12-04T10:49:11.0291551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0291630Z raise RuntimeError(msg) 2025-12-04T10:49:11.0292093Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0292096Z 2025-12-04T10:49:11.0292202Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0292499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0292502Z 2025-12-04T10:49:11.0292614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0292697Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0292782Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0293089Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0293187Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0293248Z graph_break [] 2025-12-04T10:49:11.0293330Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0293432Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0293518Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0293813Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0293861Z graph_break [] 2025-12-04T10:49:11.0293935Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0294093Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0294177Z Traceback (most recent call last): 2025-12-04T10:49:11.0294340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0294404Z method(*args, **kwargs) 2025-12-04T10:49:11.0294564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0294625Z method(*args, **kwargs) 2025-12-04T10:49:11.0294811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0294861Z with policy(): 2025-12-04T10:49:11.0295034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0295085Z raise RuntimeError(msg) 2025-12-04T10:49:11.0295515Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0295519Z 2025-12-04T10:49:11.0295620Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0295935Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0295937Z 2025-12-04T10:49:11.0296060Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0296162Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0296227Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0296522Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0296623Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0296669Z graph_break [] 2025-12-04T10:49:11.0296767Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0296832Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0296931Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0297218Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0297282Z graph_break [] 2025-12-04T10:49:11.0297364Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0297457Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0297533Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0297852Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0297907Z graph_break [] 2025-12-04T10:49:11.0298173Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0b5d7ef514cc7a4a.xml - 2025-12-04T10:49:11.0298254Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0298888Z FAILED [0.5952s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0298892Z 2025-12-04T10:49:11.0299009Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0299307Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0299322Z 2025-12-04T10:49:11.0299416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0299501Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0299575Z ================== 1 failed, 15 deselected, 2 rerun in 4.42s =================== 2025-12-04T10:49:11.0299655Z Got exit code 1 2025-12-04T10:49:11.0299706Z Retrying single test... 2025-12-04T10:49:11.0299926Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9ca3df57aa53f606.xml 2025-12-04T10:49:11.0299992Z ============================= test session starts ============================== 2025-12-04T10:49:11.0300121Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0300183Z cachedir: .pytest_cache 2025-12-04T10:49:11.0300398Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0300453Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0300520Z configfile: pytest.ini 2025-12-04T10:49:11.0300694Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0300804Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0301106Z stepcurrent: skipping 15 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0301174Z Running 1 items in this shard 2025-12-04T10:49:11.0301177Z 2025-12-04T10:49:11.0301558Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:08.530950048 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0301560Z 2025-12-04T10:49:11.0301727Z [W1204 10:15:15.992235949 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0301740Z 2025-12-04T10:49:11.0301955Z [W1204 10:15:15.992391456 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0301974Z 2025-12-04T10:49:11.0302138Z [W1204 10:15:15.995913088 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0302140Z 2025-12-04T10:49:11.0302312Z [W1204 10:15:15.996206632 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0302314Z 2025-12-04T10:49:11.0302491Z [W1204 10:15:15.996284690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0302493Z 2025-12-04T10:49:11.0302651Z [W1204 10:15:15.998670414 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0302654Z 2025-12-04T10:49:11.0302830Z [W1204 10:15:15.998930129 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0302835Z 2025-12-04T10:49:11.0303002Z [W1204 10:15:15.999008947 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0303004Z 2025-12-04T10:49:11.0303077Z ('RERUN', {'yellow': True}) [10.3604s] [100%] 2025-12-04T10:49:11.0303464Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:16.598134529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0303467Z 2025-12-04T10:49:11.0303626Z [W1204 10:15:16.598479223 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0303629Z 2025-12-04T10:49:11.0303804Z [W1204 10:15:16.598559561 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0303807Z 2025-12-04T10:49:11.0303970Z [W1204 10:15:16.599902405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0303972Z 2025-12-04T10:49:11.0304143Z [W1204 10:15:16.600157570 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0304145Z 2025-12-04T10:49:11.0304345Z [W1204 10:15:16.600235589 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0304347Z 2025-12-04T10:49:11.0304505Z [W1204 10:15:16.602403206 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0304508Z 2025-12-04T10:49:11.0304685Z [W1204 10:15:16.602656121 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0304687Z 2025-12-04T10:49:11.0304851Z [W1204 10:15:16.602731140 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0304853Z 2025-12-04T10:49:11.0304934Z ('RERUN', {'yellow': True}) [0.4572s] [100%] 2025-12-04T10:49:11.0305316Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:16.058674504 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0305318Z 2025-12-04T10:49:11.0305478Z [W1204 10:15:16.059050347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0305492Z 2025-12-04T10:49:11.0305680Z [W1204 10:15:16.059134065 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0305695Z 2025-12-04T10:49:11.0305863Z [W1204 10:15:16.060494179 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0305865Z 2025-12-04T10:49:11.0306035Z [W1204 10:15:16.060745354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0306037Z 2025-12-04T10:49:11.0306213Z [W1204 10:15:16.060825332 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0306215Z 2025-12-04T10:49:11.0306373Z [W1204 10:15:16.062950591 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0306376Z 2025-12-04T10:49:11.0306551Z [W1204 10:15:16.063212855 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0306553Z 2025-12-04T10:49:11.0306721Z [W1204 10:15:16.063289354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0306723Z 2025-12-04T10:49:11.0306788Z FAILED [0.4404s] [100%] 2025-12-04T10:49:11.0306790Z 2025-12-04T10:49:11.0306852Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0307028Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0307107Z Traceback (most recent call last): 2025-12-04T10:49:11.0307278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0307344Z method(*args, **kwargs) 2025-12-04T10:49:11.0307511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0307573Z method(*args, **kwargs) 2025-12-04T10:49:11.0307731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0307818Z with policy(): 2025-12-04T10:49:11.0307982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0308046Z raise RuntimeError(msg) 2025-12-04T10:49:11.0308472Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0308474Z 2025-12-04T10:49:11.0308565Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0312149Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0312155Z 2025-12-04T10:49:11.0312247Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0312322Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0312380Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0312660Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0312734Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0312805Z graph_break [] 2025-12-04T10:49:11.0312879Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0313226Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0313288Z if out == self.unknown_value: 2025-12-04T10:49:11.0313440Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0313489Z Traceback (most recent call last): 2025-12-04T10:49:11.0313644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0313686Z method(*args, **kwargs) 2025-12-04T10:49:11.0313836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0313878Z method(*args, **kwargs) 2025-12-04T10:49:11.0314026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0314064Z with policy(): 2025-12-04T10:49:11.0314217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0314257Z raise RuntimeError(msg) 2025-12-04T10:49:11.0314665Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0314668Z 2025-12-04T10:49:11.0314740Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0315034Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0315038Z 2025-12-04T10:49:11.0315123Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0315196Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0315252Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0315553Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0315628Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0315664Z graph_break [] 2025-12-04T10:49:11.0315737Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0316079Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0316124Z if out == self.unknown_value: 2025-12-04T10:49:11.0316194Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0316249Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0316320Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0316586Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0316632Z graph_break [] 2025-12-04T10:49:11.0316684Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0316833Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0316892Z Traceback (most recent call last): 2025-12-04T10:49:11.0317044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0317085Z method(*args, **kwargs) 2025-12-04T10:49:11.0317235Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0317275Z method(*args, **kwargs) 2025-12-04T10:49:11.0317423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0317461Z with policy(): 2025-12-04T10:49:11.0317611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0317652Z raise RuntimeError(msg) 2025-12-04T10:49:11.0318060Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0318063Z 2025-12-04T10:49:11.0318135Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0318423Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0318426Z 2025-12-04T10:49:11.0318510Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0318582Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0318637Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0318908Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0318980Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0319015Z graph_break [] 2025-12-04T10:49:11.0319110Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0319447Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0319491Z if out == self.unknown_value: 2025-12-04T10:49:11.0319562Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0319617Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0319688Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0319958Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0319993Z graph_break [] 2025-12-04T10:49:11.0320065Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0320118Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0320188Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0320453Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0320517Z graph_break [] 2025-12-04T10:49:11.0320761Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9ca3df57aa53f606.xml - 2025-12-04T10:49:11.0320821Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0321459Z FAILED [0.4404s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0321463Z 2025-12-04T10:49:11.0321533Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0321822Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0321824Z 2025-12-04T10:49:11.0321940Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0322005Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0322072Z ================== 1 failed, 57 deselected, 2 rerun in 11.41s ================== 2025-12-04T10:49:11.0322109Z Got exit code 1 2025-12-04T10:49:11.0322148Z Retrying single test... 2025-12-04T10:49:11.0322347Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bd24b8558ecf2685.xml 2025-12-04T10:49:11.0322406Z ============================= test session starts ============================== 2025-12-04T10:49:11.0322518Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0322560Z cachedir: .pytest_cache 2025-12-04T10:49:11.0322719Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0322766Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0322805Z configfile: pytest.ini 2025-12-04T10:49:11.0323004Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0323077Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0323363Z stepcurrent: skipping 15 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0323408Z Running 1 items in this shard 2025-12-04T10:49:11.0323411Z 2025-12-04T10:49:11.0323770Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:25.269363341 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0323773Z 2025-12-04T10:49:11.0323926Z [W1204 10:15:33.868870138 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0323929Z 2025-12-04T10:49:11.0324079Z [W1204 10:15:33.869044494 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0324080Z 2025-12-04T10:49:11.0324246Z [W1204 10:15:33.873022497 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0324248Z 2025-12-04T10:49:11.0324395Z [W1204 10:15:33.873297222 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0324412Z 2025-12-04T10:49:11.0324559Z [W1204 10:15:33.873377920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0324561Z 2025-12-04T10:49:11.0324709Z [W1204 10:15:33.876030948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0324712Z 2025-12-04T10:49:11.0324858Z [W1204 10:15:33.876299383 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0324860Z 2025-12-04T10:49:11.0325007Z [W1204 10:15:33.876372712 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0325009Z 2025-12-04T10:49:11.0325059Z ('RERUN', {'yellow': True}) [10.6304s] [100%] 2025-12-04T10:49:11.0325417Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:34.699404939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0325419Z 2025-12-04T10:49:11.0325568Z [W1204 10:15:34.699801302 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0325570Z 2025-12-04T10:49:11.0325717Z [W1204 10:15:34.699886380 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0325719Z 2025-12-04T10:49:11.0325867Z [W1204 10:15:34.701290703 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0325869Z 2025-12-04T10:49:11.0326015Z [W1204 10:15:34.701541358 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0326018Z 2025-12-04T10:49:11.0326164Z [W1204 10:15:34.701619036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0326166Z 2025-12-04T10:49:11.0326332Z [W1204 10:15:34.703912262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0326334Z 2025-12-04T10:49:11.0326482Z [W1204 10:15:34.704173577 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0326484Z 2025-12-04T10:49:11.0326631Z [W1204 10:15:34.704248935 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0326634Z 2025-12-04T10:49:11.0326682Z ('RERUN', {'yellow': True}) [0.6805s] [100%] 2025-12-04T10:49:11.0327035Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:15:34.378929039 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327037Z 2025-12-04T10:49:11.0327185Z [W1204 10:15:34.379436929 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327187Z 2025-12-04T10:49:11.0327333Z [W1204 10:15:34.379543137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327335Z 2025-12-04T10:49:11.0327482Z [W1204 10:15:34.380996568 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327495Z 2025-12-04T10:49:11.0327642Z [W1204 10:15:34.381274823 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0327657Z 2025-12-04T10:49:11.0327804Z [W1204 10:15:34.381351771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327806Z 2025-12-04T10:49:11.0327953Z [W1204 10:15:34.383665616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0327955Z 2025-12-04T10:49:11.0328102Z [W1204 10:15:34.383925681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0328104Z 2025-12-04T10:49:11.0328249Z [W1204 10:15:34.383999350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0328253Z 2025-12-04T10:49:11.0328290Z FAILED [0.6677s] [100%] 2025-12-04T10:49:11.0328292Z 2025-12-04T10:49:11.0328345Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0328496Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0328541Z Traceback (most recent call last): 2025-12-04T10:49:11.0328697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0328738Z method(*args, **kwargs) 2025-12-04T10:49:11.0328890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0328929Z method(*args, **kwargs) 2025-12-04T10:49:11.0329078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0329116Z with policy(): 2025-12-04T10:49:11.0329267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0329308Z raise RuntimeError(msg) 2025-12-04T10:49:11.0329702Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0329724Z 2025-12-04T10:49:11.0329798Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0330085Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0330089Z 2025-12-04T10:49:11.0330174Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0330246Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0330302Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0330573Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0330645Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0330682Z graph_break [] 2025-12-04T10:49:11.0330753Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0331098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0331151Z if out == self.unknown_value: 2025-12-04T10:49:11.0331312Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0331356Z Traceback (most recent call last): 2025-12-04T10:49:11.0331510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0331549Z method(*args, **kwargs) 2025-12-04T10:49:11.0331703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0331742Z method(*args, **kwargs) 2025-12-04T10:49:11.0331940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0331978Z with policy(): 2025-12-04T10:49:11.0332128Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0332170Z raise RuntimeError(msg) 2025-12-04T10:49:11.0332572Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0332574Z 2025-12-04T10:49:11.0332648Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0332934Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0332938Z 2025-12-04T10:49:11.0333024Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0333095Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0333152Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0333422Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0333492Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0333558Z graph_break [] 2025-12-04T10:49:11.0333629Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0333968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0334011Z if out == self.unknown_value: 2025-12-04T10:49:11.0334082Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0334136Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0334208Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0334477Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0334514Z graph_break [] 2025-12-04T10:49:11.0334565Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0334715Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0334772Z Traceback (most recent call last): 2025-12-04T10:49:11.0334925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0334979Z method(*args, **kwargs) 2025-12-04T10:49:11.0335129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0335168Z method(*args, **kwargs) 2025-12-04T10:49:11.0335318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0335354Z with policy(): 2025-12-04T10:49:11.0335505Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0335546Z raise RuntimeError(msg) 2025-12-04T10:49:11.0335947Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0335951Z 2025-12-04T10:49:11.0336023Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0336309Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0336311Z 2025-12-04T10:49:11.0336398Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0336468Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0336523Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0336792Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0336865Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0336901Z graph_break [] 2025-12-04T10:49:11.0336971Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0337334Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0337377Z if out == self.unknown_value: 2025-12-04T10:49:11.0337448Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0337502Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0337573Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0337841Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0337881Z graph_break [] 2025-12-04T10:49:11.0337950Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0338004Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0338073Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0338339Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0338375Z graph_break [] 2025-12-04T10:49:11.0338628Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bd24b8558ecf2685.xml - 2025-12-04T10:49:11.0338688Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0339332Z FAILED [0.6677s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0339335Z 2025-12-04T10:49:11.0339408Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0339695Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0339699Z 2025-12-04T10:49:11.0339783Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0339845Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0339910Z ================== 1 failed, 57 deselected, 2 rerun in 12.13s ================== 2025-12-04T10:49:11.0339947Z Got exit code 1 2025-12-04T10:49:11.0340184Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0340312Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0340507Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9d367633f4a743ad.xml 2025-12-04T10:49:11.0340564Z ============================= test session starts ============================== 2025-12-04T10:49:11.0340678Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0340720Z cachedir: .pytest_cache 2025-12-04T10:49:11.0340877Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0340923Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0340962Z configfile: pytest.ini 2025-12-04T10:49:11.0341145Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0341218Z collecting ... collected 58 items / 16 deselected / 42 selected 2025-12-04T10:49:11.0341271Z stepcurrent: skipping 16 already run items. 2025-12-04T10:49:11.0341314Z Running 42 items in this shard 2025-12-04T10:49:11.0341317Z 2025-12-04T10:49:11.0341565Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6982s] [ 2%] 2025-12-04T10:49:11.0341813Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6402s] [ 2%] 2025-12-04T10:49:11.0342074Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.6605s] [ 2%] 2025-12-04T10:49:11.0342077Z 2025-12-04T10:49:11.0342128Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0342276Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0342335Z Traceback (most recent call last): 2025-12-04T10:49:11.0342489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0342544Z method(*args, **kwargs) 2025-12-04T10:49:11.0342694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0342733Z method(*args, **kwargs) 2025-12-04T10:49:11.0342882Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0342921Z with policy(): 2025-12-04T10:49:11.0343071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0343112Z raise RuntimeError(msg) 2025-12-04T10:49:11.0343509Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0343514Z 2025-12-04T10:49:11.0343586Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0343876Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0343878Z 2025-12-04T10:49:11.0343966Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0344037Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0344092Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0344269Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0344341Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0344379Z graph_break [] 2025-12-04T10:49:11.0344527Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0344573Z Traceback (most recent call last): 
2025-12-04T10:49:11.0344725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0344806Z method(*args, **kwargs) 2025-12-04T10:49:11.0344956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0344996Z method(*args, **kwargs) 2025-12-04T10:49:11.0345144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0345183Z with policy(): 2025-12-04T10:49:11.0345335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0345377Z raise RuntimeError(msg) 2025-12-04T10:49:11.0345783Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0345787Z 2025-12-04T10:49:11.0345857Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0346147Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0346167Z 2025-12-04T10:49:11.0346252Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0346337Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0346391Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0346567Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0346637Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0346676Z graph_break [] 2025-12-04T10:49:11.0346747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0346802Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0346872Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0347045Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0347083Z graph_break [] 2025-12-04T10:49:11.0347134Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0347285Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0347329Z Traceback (most recent call last): 2025-12-04T10:49:11.0347484Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0347523Z method(*args, **kwargs) 2025-12-04T10:49:11.0347672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0347711Z method(*args, **kwargs) 2025-12-04T10:49:11.0347861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0347896Z with policy(): 2025-12-04T10:49:11.0348049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0348090Z raise RuntimeError(msg) 2025-12-04T10:49:11.0348518Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0348520Z 2025-12-04T10:49:11.0348592Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0348880Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0348883Z 2025-12-04T10:49:11.0348970Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0349041Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0349097Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0349269Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0349342Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0349377Z graph_break [] 2025-12-04T10:49:11.0349450Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0349503Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0349573Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0349754Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0349812Z graph_break [] 2025-12-04T10:49:11.0349882Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0349937Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0350006Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0350180Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0350216Z graph_break [] 2025-12-04T10:49:11.0350458Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9d367633f4a743ad.xml - 2025-12-04T10:49:11.0350519Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0351149Z FAILED [0.6605s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0351153Z 2025-12-04T10:49:11.0351227Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0351511Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0351514Z 2025-12-04T10:49:11.0351599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0351661Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0351727Z ================== 1 failed, 16 deselected, 2 rerun in 4.14s =================== 2025-12-04T10:49:11.0351765Z Got exit code 1 2025-12-04T10:49:11.0351804Z Retrying single test... 
2025-12-04T10:49:11.0352112Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c6904ea01da2ccd8.xml 2025-12-04T10:49:11.0352209Z ============================= test session starts ============================== 2025-12-04T10:49:11.0352320Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0352360Z cachedir: .pytest_cache 2025-12-04T10:49:11.0352518Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0352564Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0352605Z configfile: pytest.ini 2025-12-04T10:49:11.0352767Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0352840Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0353124Z stepcurrent: skipping 16 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0353168Z Running 1 items in this shard 2025-12-04T10:49:11.0353170Z 2025-12-04T10:49:11.0353529Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:15:55.957869759 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0353548Z 2025-12-04T10:49:11.0353699Z [W1204 10:16:03.623042179 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0353718Z 2025-12-04T10:49:11.0353868Z [W1204 10:16:03.623202036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0353870Z 2025-12-04T10:49:11.0354019Z [W1204 10:16:03.627145800 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354021Z 2025-12-04T10:49:11.0354170Z [W1204 10:16:03.627532062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354171Z 2025-12-04T10:49:11.0354320Z [W1204 10:16:03.627611071 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354323Z 2025-12-04T10:49:11.0354470Z [W1204 10:16:03.630216180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354473Z 2025-12-04T10:49:11.0354620Z [W1204 10:16:03.630500335 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354622Z 2025-12-04T10:49:11.0354769Z [W1204 10:16:03.630577573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0354774Z 2025-12-04T10:49:11.0354824Z ('RERUN', {'yellow': True}) [10.4292s] [100%] 2025-12-04T10:49:11.0355186Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:16:04.744378072 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0355189Z 2025-12-04T10:49:11.0355337Z [W1204 10:16:04.744825513 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0355340Z 2025-12-04T10:49:11.0355489Z [W1204 10:16:04.744905781 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0355491Z 2025-12-04T10:49:11.0355638Z [W1204 10:16:04.746296404 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0355659Z 2025-12-04T10:49:11.0355808Z [W1204 10:16:04.746622988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0355810Z 2025-12-04T10:49:11.0355958Z [W1204 10:16:04.746701027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0355961Z 2025-12-04T10:49:11.0356109Z [W1204 10:16:04.748892054 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0356112Z 2025-12-04T10:49:11.0356260Z [W1204 10:16:04.749157149 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0356262Z 2025-12-04T10:49:11.0356408Z [W1204 10:16:04.749235108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0356410Z 2025-12-04T10:49:11.0356464Z ('RERUN', {'yellow': True}) [0.6107s] [100%] 2025-12-04T10:49:11.0356821Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:16:04.354543755 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0356834Z 2025-12-04T10:49:11.0356982Z [W1204 10:16:04.354963807 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0356997Z 2025-12-04T10:49:11.0357146Z [W1204 10:16:04.355063315 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0357148Z 2025-12-04T10:49:11.0357294Z [W1204 10:16:04.356496788 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0357296Z 2025-12-04T10:49:11.0357445Z [W1204 10:16:04.356847751 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0357447Z 2025-12-04T10:49:11.0357594Z [W1204 10:16:04.356930449 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0357597Z 2025-12-04T10:49:11.0357743Z [W1204 10:16:04.359206355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0357748Z 2025-12-04T10:49:11.0357897Z [W1204 10:16:04.359476830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0357900Z 2025-12-04T10:49:11.0358046Z [W1204 10:16:04.359553288 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0358048Z 2025-12-04T10:49:11.0358091Z FAILED [0.6060s] [100%] 2025-12-04T10:49:11.0358093Z 2025-12-04T10:49:11.0358145Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0358296Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0358342Z Traceback (most recent call last): 2025-12-04T10:49:11.0358499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0358541Z method(*args, **kwargs) 2025-12-04T10:49:11.0358694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0358735Z method(*args, **kwargs) 2025-12-04T10:49:11.0358883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0358922Z with policy(): 2025-12-04T10:49:11.0359094Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0359136Z raise RuntimeError(msg) 2025-12-04T10:49:11.0359537Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0359541Z 2025-12-04T10:49:11.0359616Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0359902Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0359904Z 2025-12-04T10:49:11.0359993Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0360065Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0360121Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0360298Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0360382Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0360433Z graph_break [] 2025-12-04T10:49:11.0360505Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0360851Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0360897Z if out == self.unknown_value: 2025-12-04T10:49:11.0361046Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0361090Z Traceback (most recent call last): 2025-12-04T10:49:11.0361246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0361287Z method(*args, **kwargs) 2025-12-04T10:49:11.0361440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0361479Z method(*args, **kwargs) 2025-12-04T10:49:11.0361630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0361666Z with policy(): 2025-12-04T10:49:11.0361819Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0361909Z raise RuntimeError(msg) 2025-12-04T10:49:11.0362316Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0362319Z 2025-12-04T10:49:11.0362393Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0362680Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0362682Z 2025-12-04T10:49:11.0362768Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0362876Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0362932Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0363108Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0363182Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0363218Z graph_break [] 2025-12-04T10:49:11.0363289Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0363629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0363674Z if out == self.unknown_value: 2025-12-04T10:49:11.0363744Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0363801Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0363873Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0364047Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0364116Z graph_break [] 2025-12-04T10:49:11.0364167Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0364318Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0364378Z Traceback (most recent call last): 2025-12-04T10:49:11.0364533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0364572Z method(*args, **kwargs) 2025-12-04T10:49:11.0364725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0364763Z method(*args, **kwargs) 2025-12-04T10:49:11.0364913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0364950Z with policy(): 2025-12-04T10:49:11.0365101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0365141Z raise RuntimeError(msg) 2025-12-04T10:49:11.0365547Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0365550Z 2025-12-04T10:49:11.0365623Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0365911Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0365913Z 2025-12-04T10:49:11.0366001Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0366072Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0366129Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0366301Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0366372Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0366409Z graph_break [] 2025-12-04T10:49:11.0366502Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0366841Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0366887Z if out == self.unknown_value: 2025-12-04T10:49:11.0366957Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0367012Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0367084Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0367257Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0367293Z graph_break [] 2025-12-04T10:49:11.0367365Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0367419Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0367490Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0367663Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0367709Z graph_break [] 2025-12-04T10:49:11.0367952Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c6904ea01da2ccd8.xml - 2025-12-04T10:49:11.0368027Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0368668Z FAILED [0.6060s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0368671Z 2025-12-04T10:49:11.0368743Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0369033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0369036Z 2025-12-04T10:49:11.0369122Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0369182Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0369249Z ================== 1 failed, 57 deselected, 2 rerun in 11.79s ================== 2025-12-04T10:49:11.0369286Z Got exit code 1 2025-12-04T10:49:11.0369327Z Retrying single test... 2025-12-04T10:49:11.0369523Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-133eefb9b5240e6d.xml 2025-12-04T10:49:11.0369581Z ============================= test session starts ============================== 2025-12-04T10:49:11.0369693Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0369734Z cachedir: .pytest_cache 2025-12-04T10:49:11.0369893Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0369939Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0369978Z configfile: pytest.ini 2025-12-04T10:49:11.0370143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0370241Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0370526Z stepcurrent: skipping 16 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0370570Z Running 1 items in this shard 2025-12-04T10:49:11.0370572Z 2025-12-04T10:49:11.0370931Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:16:13.485656605 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0370935Z 2025-12-04T10:49:11.0371088Z [W1204 10:16:21.901257694 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371090Z 2025-12-04T10:49:11.0371241Z [W1204 10:16:21.901405191 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371243Z 2025-12-04T10:49:11.0371393Z [W1204 10:16:21.905152629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371405Z 2025-12-04T10:49:11.0371553Z [W1204 10:16:21.905445813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371556Z 2025-12-04T10:49:11.0371717Z [W1204 10:16:21.905524862 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371719Z 2025-12-04T10:49:11.0371904Z [W1204 10:16:21.907884406 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0371906Z 2025-12-04T10:49:11.0372054Z [W1204 10:16:21.908151221 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0372056Z 2025-12-04T10:49:11.0372205Z [W1204 10:16:21.908229700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0372206Z 2025-12-04T10:49:11.0372256Z ('RERUN', {'yellow': True}) [9.9758s] [100%] 2025-12-04T10:49:11.0372615Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:16:22.863213806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0372618Z 2025-12-04T10:49:11.0372767Z [W1204 10:16:22.863615478 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0372769Z 2025-12-04T10:49:11.0372918Z [W1204 10:16:22.863710686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0372920Z 2025-12-04T10:49:11.0373067Z [W1204 10:16:22.865071480 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0373070Z 2025-12-04T10:49:11.0373218Z [W1204 10:16:22.865392744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0373220Z 2025-12-04T10:49:11.0373367Z [W1204 10:16:22.865469112 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0373370Z 2025-12-04T10:49:11.0373518Z [W1204 10:16:22.867618291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0373520Z 2025-12-04T10:49:11.0373693Z [W1204 10:16:22.867875086 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0373695Z 2025-12-04T10:49:11.0373842Z [W1204 10:16:22.867949974 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0373844Z 2025-12-04T10:49:11.0373892Z ('RERUN', {'yellow': True}) [0.4655s] [100%] 2025-12-04T10:49:11.0374252Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:16:22.319824503 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0374255Z 2025-12-04T10:49:11.0374402Z [W1204 10:16:22.320208866 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0374405Z 2025-12-04T10:49:11.0374555Z [W1204 10:16:22.320291674 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0374557Z 2025-12-04T10:49:11.0374708Z [W1204 10:16:22.321654908 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0374709Z 2025-12-04T10:49:11.0374870Z [W1204 10:16:22.321975942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0374872Z 2025-12-04T10:49:11.0375019Z [W1204 10:16:22.322058960 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0375034Z 2025-12-04T10:49:11.0375180Z [W1204 10:16:22.324236778 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0375184Z 2025-12-04T10:49:11.0375332Z [W1204 10:16:22.324491673 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0375334Z 2025-12-04T10:49:11.0375482Z [W1204 10:16:22.324567801 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0375484Z 2025-12-04T10:49:11.0375521Z FAILED [0.4554s] [100%] 2025-12-04T10:49:11.0375524Z 2025-12-04T10:49:11.0375578Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0375727Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0375775Z Traceback (most recent call last): 2025-12-04T10:49:11.0375931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0375974Z method(*args, **kwargs) 2025-12-04T10:49:11.0376126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0376167Z method(*args, **kwargs) 2025-12-04T10:49:11.0376317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0376355Z with policy(): 2025-12-04T10:49:11.0376507Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0376549Z raise RuntimeError(msg) 2025-12-04T10:49:11.0376949Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0376952Z 2025-12-04T10:49:11.0377024Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0377346Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0377348Z 2025-12-04T10:49:11.0377435Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0377509Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0377565Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0377743Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0377814Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0377852Z graph_break [] 2025-12-04T10:49:11.0377923Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0378267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0378312Z if out == self.unknown_value: 2025-12-04T10:49:11.0378469Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0378514Z Traceback (most recent call last): 2025-12-04T10:49:11.0378677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0378717Z method(*args, **kwargs) 2025-12-04T10:49:11.0378866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0378908Z method(*args, **kwargs) 2025-12-04T10:49:11.0379059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0379098Z with policy(): 2025-12-04T10:49:11.0379248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0379292Z raise RuntimeError(msg) 2025-12-04T10:49:11.0379696Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0379700Z 2025-12-04T10:49:11.0379773Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0380068Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0380073Z 2025-12-04T10:49:11.0380158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0380230Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0380285Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0380459Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0380531Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0380568Z graph_break [] 2025-12-04T10:49:11.0380639Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0381005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0381048Z if out == self.unknown_value: 2025-12-04T10:49:11.0381120Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0381173Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0381246Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0381419Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0381458Z graph_break [] 2025-12-04T10:49:11.0381509Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0381662Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0381708Z Traceback (most recent call last): 2025-12-04T10:49:11.0381897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0381937Z method(*args, **kwargs) 2025-12-04T10:49:11.0382086Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0382151Z method(*args, **kwargs) 2025-12-04T10:49:11.0382299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0382352Z with policy(): 2025-12-04T10:49:11.0382503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0382544Z raise RuntimeError(msg) 2025-12-04T10:49:11.0382950Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0382953Z 2025-12-04T10:49:11.0383027Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0383314Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0383317Z 2025-12-04T10:49:11.0383402Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0383473Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0383528Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0383702Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0383772Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0383809Z graph_break [] 2025-12-04T10:49:11.0383878Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0384219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0384262Z if out == self.unknown_value: 2025-12-04T10:49:11.0384333Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0384387Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0384458Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0384660Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0384698Z graph_break [] 2025-12-04T10:49:11.0384768Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0384823Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0384893Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0385068Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0385105Z graph_break [] 2025-12-04T10:49:11.0385347Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-133eefb9b5240e6d.xml - 2025-12-04T10:49:11.0385405Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0386038Z FAILED [0.4554s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0386064Z 2025-12-04T10:49:11.0386135Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0386420Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0386422Z 2025-12-04T10:49:11.0386509Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0386569Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0386636Z ================== 1 failed, 57 deselected, 2 rerun in 11.06s ================== 2025-12-04T10:49:11.0386671Z Got exit code 1 2025-12-04T10:49:11.0386910Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0387038Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0387233Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-29bc1b49a0501a4a.xml 2025-12-04T10:49:11.0387291Z ============================= test session starts ============================== 2025-12-04T10:49:11.0387404Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0387445Z cachedir: .pytest_cache 2025-12-04T10:49:11.0387602Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0387649Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0387689Z configfile: pytest.ini 2025-12-04T10:49:11.0387851Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0387925Z collecting ... collected 58 items / 17 deselected / 41 selected 2025-12-04T10:49:11.0387978Z stepcurrent: skipping 17 already run items. 2025-12-04T10:49:11.0388023Z Running 41 items in this shard 2025-12-04T10:49:11.0388025Z 2025-12-04T10:49:11.0388292Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5785s] [ 2%] 2025-12-04T10:49:11.0388532Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6690s] [ 2%] 2025-12-04T10:49:11.0388753Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.6640s] [ 2%] 2025-12-04T10:49:11.0388757Z 2025-12-04T10:49:11.0388808Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0388957Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0389003Z Traceback (most recent call last): 2025-12-04T10:49:11.0389158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0389200Z method(*args, **kwargs) 2025-12-04T10:49:11.0389351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0389390Z method(*args, **kwargs) 2025-12-04T10:49:11.0389541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0389589Z with policy(): 2025-12-04T10:49:11.0389741Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0389796Z raise RuntimeError(msg) 2025-12-04T10:49:11.0390185Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0390189Z 2025-12-04T10:49:11.0390262Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0390548Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0390551Z 2025-12-04T10:49:11.0390635Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0390707Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0390763Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0390938Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0391009Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0391046Z graph_break [] 2025-12-04T10:49:11.0391197Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0391242Z Traceback (most recent call last): 2025-12-04T10:49:11.0391393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0391435Z method(*args, **kwargs) 
2025-12-04T10:49:11.0391584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0391625Z method(*args, **kwargs) 2025-12-04T10:49:11.0391775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0391812Z with policy(): 2025-12-04T10:49:11.0391999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0392083Z raise RuntimeError(msg) 2025-12-04T10:49:11.0392482Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0392486Z 2025-12-04T10:49:11.0392558Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0392843Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0392846Z 2025-12-04T10:49:11.0392930Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0393002Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0393057Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0393232Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0393302Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.0393352Z graph_break [] 2025-12-04T10:49:11.0393422Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0393492Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0393562Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0393734Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0393771Z graph_break [] 2025-12-04T10:49:11.0393823Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0393974Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0394019Z Traceback (most recent call last): 2025-12-04T10:49:11.0394173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0394213Z method(*args, **kwargs) 2025-12-04T10:49:11.0394362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0394401Z method(*args, **kwargs) 2025-12-04T10:49:11.0394551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0394587Z with policy(): 2025-12-04T10:49:11.0394740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0394780Z raise RuntimeError(msg) 2025-12-04T10:49:11.0395176Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0395179Z 2025-12-04T10:49:11.0395250Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0395538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0395540Z 2025-12-04T10:49:11.0395626Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0395719Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0395776Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0395948Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0396021Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0396057Z graph_break [] 2025-12-04T10:49:11.0396129Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0396184Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0396254Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0396426Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0396462Z graph_break [] 2025-12-04T10:49:11.0396533Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0396588Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0396657Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0396832Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0396879Z graph_break [] 2025-12-04T10:49:11.0397119Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-29bc1b49a0501a4a.xml - 2025-12-04T10:49:11.0397191Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0397814Z FAILED [0.6640s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0397817Z 2025-12-04T10:49:11.0397889Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0398172Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0398175Z 2025-12-04T10:49:11.0398260Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0398320Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0398388Z ================== 1 failed, 17 deselected, 2 rerun in 4.07s =================== 2025-12-04T10:49:11.0398424Z Got exit code 1 2025-12-04T10:49:11.0398465Z Retrying single test... 
2025-12-04T10:49:11.0398661Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-054ceffac23daad3.xml 2025-12-04T10:49:11.0398720Z ============================= test session starts ============================== 2025-12-04T10:49:11.0398832Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0398873Z cachedir: .pytest_cache 2025-12-04T10:49:11.0399030Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0399075Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0399115Z configfile: pytest.ini 2025-12-04T10:49:11.0399300Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0399374Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0399655Z stepcurrent: skipping 17 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0399700Z Running 1 items in this shard 2025-12-04T10:49:11.0399702Z 2025-12-04T10:49:11.0400057Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:16:42.106210018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400060Z 2025-12-04T10:49:11.0400212Z [W1204 10:16:50.671511528 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0400216Z 2025-12-04T10:49:11.0400367Z [W1204 10:16:50.671663165 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400369Z 2025-12-04T10:49:11.0400519Z [W1204 10:16:50.676981333 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400531Z 2025-12-04T10:49:11.0400679Z [W1204 10:16:50.677444744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400695Z 2025-12-04T10:49:11.0400843Z [W1204 10:16:50.677527532 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400845Z 2025-12-04T10:49:11.0400992Z [W1204 10:16:50.679934796 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0400994Z 2025-12-04T10:49:11.0401143Z [W1204 10:16:50.680202470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0401145Z 2025-12-04T10:49:11.0401290Z [W1204 10:16:50.680281879 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0401293Z 2025-12-04T10:49:11.0401344Z ('RERUN', {'yellow': True}) [10.1992s] [100%] 2025-12-04T10:49:11.0401697Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:16:51.747139194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0401700Z 2025-12-04T10:49:11.0401882Z [W1204 10:16:51.747527947 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0401884Z 2025-12-04T10:49:11.0402034Z [W1204 10:16:51.747607955 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402036Z 2025-12-04T10:49:11.0402181Z [W1204 10:16:51.748972899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402185Z 2025-12-04T10:49:11.0402333Z [W1204 10:16:51.749312722 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402336Z 2025-12-04T10:49:11.0402482Z [W1204 10:16:51.749394491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402484Z 2025-12-04T10:49:11.0402632Z [W1204 10:16:51.751595348 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402634Z 2025-12-04T10:49:11.0402813Z [W1204 10:16:51.751858343 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0402816Z 2025-12-04T10:49:11.0402962Z [W1204 10:16:51.751933942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0402965Z 2025-12-04T10:49:11.0403015Z ('RERUN', {'yellow': True}) [0.5781s] [100%] 2025-12-04T10:49:11.0403367Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:16:51.328686266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0403370Z 2025-12-04T10:49:11.0403519Z [W1204 10:16:51.329097948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0403520Z 2025-12-04T10:49:11.0403670Z [W1204 10:16:51.329196317 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0403672Z 2025-12-04T10:49:11.0403819Z [W1204 10:16:51.330596540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0403834Z 2025-12-04T10:49:11.0403981Z [W1204 10:16:51.330934983 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0404003Z 2025-12-04T10:49:11.0404150Z [W1204 10:16:51.331018991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0404152Z 2025-12-04T10:49:11.0404299Z [W1204 10:16:51.333227879 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0404301Z 2025-12-04T10:49:11.0404449Z [W1204 10:16:51.333489224 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0404451Z 2025-12-04T10:49:11.0404600Z [W1204 10:16:51.333565002 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0404602Z 2025-12-04T10:49:11.0404641Z FAILED [0.5736s] [100%] 2025-12-04T10:49:11.0404643Z 2025-12-04T10:49:11.0404696Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0404847Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0404892Z Traceback (most recent call last): 2025-12-04T10:49:11.0405049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0405088Z method(*args, **kwargs) 2025-12-04T10:49:11.0405244Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0405282Z method(*args, **kwargs) 2025-12-04T10:49:11.0405433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0405470Z with policy(): 2025-12-04T10:49:11.0405621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0405662Z raise RuntimeError(msg) 2025-12-04T10:49:11.0406056Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0406059Z 2025-12-04T10:49:11.0406152Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0406440Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0406442Z 2025-12-04T10:49:11.0406529Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0406600Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0406656Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0406833Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0406904Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0406940Z graph_break [] 2025-12-04T10:49:11.0407013Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0407356Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0407412Z if out == self.unknown_value: 2025-12-04T10:49:11.0407559Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0407616Z Traceback (most recent call last): 2025-12-04T10:49:11.0407768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0407808Z method(*args, **kwargs) 2025-12-04T10:49:11.0407957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0407997Z method(*args, **kwargs) 2025-12-04T10:49:11.0408148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0408184Z with policy(): 2025-12-04T10:49:11.0408335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0408375Z raise RuntimeError(msg) 2025-12-04T10:49:11.0408771Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0408774Z 2025-12-04T10:49:11.0408845Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0409133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0409135Z 2025-12-04T10:49:11.0409219Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0409291Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0409347Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0409521Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0409594Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0409629Z graph_break [] 2025-12-04T10:49:11.0409700Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0410063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0410108Z if out == self.unknown_value: 2025-12-04T10:49:11.0410177Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0410234Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0410303Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0410477Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0410514Z graph_break [] 2025-12-04T10:49:11.0410566Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0410714Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0410762Z Traceback (most recent call last): 2025-12-04T10:49:11.0410914Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0410955Z method(*args, **kwargs) 2025-12-04T10:49:11.0411107Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0411158Z method(*args, **kwargs) 2025-12-04T10:49:11.0411307Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0411354Z with policy(): 2025-12-04T10:49:11.0411504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0411545Z raise RuntimeError(msg) 2025-12-04T10:49:11.0411987Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0411990Z 2025-12-04T10:49:11.0412062Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0412350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0412353Z 2025-12-04T10:49:11.0412437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0412508Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0412562Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0412736Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0412807Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0412844Z graph_break [] 2025-12-04T10:49:11.0412914Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0413256Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0413300Z if out == self.unknown_value: 2025-12-04T10:49:11.0413370Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0413425Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0413496Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0413706Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0413742Z graph_break [] 2025-12-04T10:49:11.0413814Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0413868Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0413939Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0414110Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0414147Z graph_break [] 2025-12-04T10:49:11.0414388Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-054ceffac23daad3.xml - 2025-12-04T10:49:11.0414447Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0415068Z FAILED [0.5736s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0415097Z 2025-12-04T10:49:11.0415168Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0415455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0415457Z 2025-12-04T10:49:11.0415544Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0415605Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0415670Z ================== 1 failed, 57 deselected, 2 rerun in 11.49s ================== 2025-12-04T10:49:11.0415708Z Got exit code 1 2025-12-04T10:49:11.0415748Z Retrying single test... 2025-12-04T10:49:11.0415944Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6e61d44b25d45b0c.xml 2025-12-04T10:49:11.0416001Z ============================= test session starts ============================== 2025-12-04T10:49:11.0416114Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0416154Z cachedir: .pytest_cache 2025-12-04T10:49:11.0416313Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0416357Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0416399Z configfile: pytest.ini 2025-12-04T10:49:11.0416560Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0416634Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0416917Z stepcurrent: skipping 17 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0416961Z Running 1 items in this shard 2025-12-04T10:49:11.0416965Z 2025-12-04T10:49:11.0417343Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:17:00.264897191 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0417346Z 2025-12-04T10:49:11.0417498Z [W1204 10:17:07.078807526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0417500Z 2025-12-04T10:49:11.0417648Z [W1204 10:17:07.078980413 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0417651Z 2025-12-04T10:49:11.0417801Z [W1204 10:17:07.082525815 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0417804Z 2025-12-04T10:49:11.0417949Z [W1204 10:17:07.082829919 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0417951Z 2025-12-04T10:49:11.0418097Z [W1204 10:17:07.082911387 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0418100Z 2025-12-04T10:49:11.0418246Z [W1204 10:17:07.085444399 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0418248Z 2025-12-04T10:49:11.0418398Z [W1204 10:17:07.085715094 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0418418Z 2025-12-04T10:49:11.0418565Z [W1204 10:17:07.085792922 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0418581Z 2025-12-04T10:49:11.0418629Z ('RERUN', {'yellow': True}) [9.4350s] [100%] 2025-12-04T10:49:11.0418987Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:17:08.244630741 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0418989Z 2025-12-04T10:49:11.0419136Z [W1204 10:17:08.245030383 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0419138Z 2025-12-04T10:49:11.0419286Z [W1204 10:17:08.245118751 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0419288Z 2025-12-04T10:49:11.0419436Z [W1204 10:17:08.246535174 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0419442Z 2025-12-04T10:49:11.0419589Z [W1204 10:17:08.246860778 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0419591Z 2025-12-04T10:49:11.0419739Z [W1204 10:17:08.246939206 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0419742Z 2025-12-04T10:49:11.0419892Z [W1204 10:17:08.249247582 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0419894Z 2025-12-04T10:49:11.0420042Z [W1204 10:17:08.249510687 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0420045Z 2025-12-04T10:49:11.0420190Z [W1204 10:17:08.249586685 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0420193Z 2025-12-04T10:49:11.0420242Z ('RERUN', {'yellow': True}) [0.6574s] [100%] 2025-12-04T10:49:11.0420595Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:17:09.902451199 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0420616Z 2025-12-04T10:49:11.0420763Z [W1204 10:17:09.902836602 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0420765Z 2025-12-04T10:49:11.0420912Z [W1204 10:17:09.902924960 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0420914Z 2025-12-04T10:49:11.0421060Z [W1204 10:17:09.904324174 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0421063Z 2025-12-04T10:49:11.0421210Z [W1204 10:17:09.904649817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0421212Z 2025-12-04T10:49:11.0421359Z [W1204 10:17:09.904727426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0421361Z 2025-12-04T10:49:11.0421508Z [W1204 10:17:09.907036801 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0421510Z 2025-12-04T10:49:11.0421657Z [W1204 10:17:09.907298886 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0421670Z 2025-12-04T10:49:11.0421817Z [W1204 10:17:09.907375445 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0421832Z 2025-12-04T10:49:11.0421902Z FAILED [0.6482s] [100%] 2025-12-04T10:49:11.0421904Z 2025-12-04T10:49:11.0421955Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0422107Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0422153Z Traceback (most recent call last): 2025-12-04T10:49:11.0422312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0422351Z method(*args, **kwargs) 2025-12-04T10:49:11.0422503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0422543Z method(*args, **kwargs) 2025-12-04T10:49:11.0422693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0422734Z with policy(): 2025-12-04T10:49:11.0422884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0422925Z raise RuntimeError(msg) 2025-12-04T10:49:11.0423321Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0423324Z 2025-12-04T10:49:11.0423398Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0423684Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0423688Z 2025-12-04T10:49:11.0423773Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0423844Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0423901Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0424111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0424183Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0424223Z graph_break [] 2025-12-04T10:49:11.0424293Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0424637Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0424682Z if out == self.unknown_value: 2025-12-04T10:49:11.0424831Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0424876Z Traceback (most recent call last): 2025-12-04T10:49:11.0425029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0425069Z method(*args, **kwargs) 2025-12-04T10:49:11.0425219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0425258Z method(*args, **kwargs) 2025-12-04T10:49:11.0425408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0425456Z with policy(): 2025-12-04T10:49:11.0425607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0425663Z raise RuntimeError(msg) 2025-12-04T10:49:11.0426061Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0426065Z 2025-12-04T10:49:11.0426137Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0426423Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0426426Z 2025-12-04T10:49:11.0426512Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0426582Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0426638Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0426811Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0426883Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0426920Z graph_break [] 2025-12-04T10:49:11.0426990Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0427328Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0427373Z if out == self.unknown_value: 2025-12-04T10:49:11.0427443Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0427499Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0427568Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0427743Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0427780Z graph_break [] 2025-12-04T10:49:11.0427851Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0427999Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0428044Z Traceback (most recent call last): 2025-12-04T10:49:11.0428201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0428240Z method(*args, **kwargs) 2025-12-04T10:49:11.0428391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0428429Z method(*args, **kwargs) 2025-12-04T10:49:11.0428580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0428617Z with policy(): 2025-12-04T10:49:11.0428771Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0428810Z raise RuntimeError(msg) 2025-12-04T10:49:11.0429207Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0429219Z 2025-12-04T10:49:11.0429291Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0429586Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0429588Z 2025-12-04T10:49:11.0429675Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0429748Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0429804Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0429977Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0430049Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0430084Z graph_break [] 2025-12-04T10:49:11.0430156Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0430498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0430543Z if out == self.unknown_value: 2025-12-04T10:49:11.0430615Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0430669Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0430739Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0430913Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0430948Z graph_break [] 2025-12-04T10:49:11.0431018Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0431074Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0431145Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0431316Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0431352Z graph_break [] 2025-12-04T10:49:11.0431617Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6e61d44b25d45b0c.xml - 2025-12-04T10:49:11.0431676Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0432430Z FAILED [0.6482s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0432435Z 2025-12-04T10:49:11.0432506Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0432794Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0432796Z 2025-12-04T10:49:11.0432881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0432959Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0433024Z ================== 1 failed, 57 deselected, 2 rerun in 10.90s ================== 2025-12-04T10:49:11.0433075Z Got exit code 1 2025-12-04T10:49:11.0433313Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0433442Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0433639Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21de0c23f9491a12.xml 2025-12-04T10:49:11.0433694Z ============================= test session starts ============================== 2025-12-04T10:49:11.0433805Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0433846Z cachedir: .pytest_cache 2025-12-04T10:49:11.0434004Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0434052Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0434093Z configfile: pytest.ini 2025-12-04T10:49:11.0434253Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0434327Z collecting ... collected 58 items / 18 deselected / 40 selected 2025-12-04T10:49:11.0434379Z stepcurrent: skipping 18 already run items. 2025-12-04T10:49:11.0434425Z Running 40 items in this shard 2025-12-04T10:49:11.0434427Z 2025-12-04T10:49:11.0434673Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0991s] [ 2%] 2025-12-04T10:49:11.0434919Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6574s] [ 2%] 2025-12-04T10:49:11.0435142Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.6458s] [ 2%] 2025-12-04T10:49:11.0435147Z 2025-12-04T10:49:11.0435197Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0435370Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0435415Z Traceback (most recent call last): 2025-12-04T10:49:11.0435572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0435611Z method(*args, **kwargs) 2025-12-04T10:49:11.0435764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0435803Z method(*args, **kwargs) 2025-12-04T10:49:11.0435953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0435990Z with policy(): 2025-12-04T10:49:11.0436143Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0436183Z raise RuntimeError(msg) 2025-12-04T10:49:11.0436698Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0436700Z 2025-12-04T10:49:11.0436788Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0437075Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0437093Z 2025-12-04T10:49:11.0437179Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0437249Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0437305Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0437581Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0437653Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0437690Z graph_break [] 2025-12-04T10:49:11.0437841Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0437885Z Traceback (most recent call last): 2025-12-04T10:49:11.0438040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.0438080Z method(*args, **kwargs) 2025-12-04T10:49:11.0438228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0438267Z method(*args, **kwargs) 2025-12-04T10:49:11.0438416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0438454Z with policy(): 2025-12-04T10:49:11.0438605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0438646Z raise RuntimeError(msg) 2025-12-04T10:49:11.0439042Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0439045Z 2025-12-04T10:49:11.0439117Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0439428Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0439430Z 2025-12-04T10:49:11.0439517Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0439588Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0439645Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0439915Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0439987Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0440024Z graph_break [] 2025-12-04T10:49:11.0440093Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0440149Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0440218Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0440484Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0440536Z graph_break [] 2025-12-04T10:49:11.0440588Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0440751Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0440797Z Traceback (most recent call last): 2025-12-04T10:49:11.0440949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0440988Z method(*args, **kwargs) 2025-12-04T10:49:11.0441140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0441180Z method(*args, **kwargs) 2025-12-04T10:49:11.0441330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0441369Z with policy(): 2025-12-04T10:49:11.0441519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0441562Z raise RuntimeError(msg) 2025-12-04T10:49:11.0442003Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0442005Z 2025-12-04T10:49:11.0442078Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0442364Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0442367Z 2025-12-04T10:49:11.0442450Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0442521Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0442576Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0442847Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0442917Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0442986Z graph_break [] 2025-12-04T10:49:11.0443059Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0443113Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0443183Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0443451Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0443489Z graph_break [] 2025-12-04T10:49:11.0443558Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0443613Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0443682Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0443953Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0443988Z graph_break [] 2025-12-04T10:49:11.0444230Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21de0c23f9491a12.xml - 2025-12-04T10:49:11.0444301Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0444938Z FAILED [0.6458s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0444940Z 2025-12-04T10:49:11.0445012Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0445295Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0445298Z 2025-12-04T10:49:11.0445385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0445446Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0445513Z ================== 1 failed, 18 deselected, 2 rerun in 4.55s =================== 2025-12-04T10:49:11.0445549Z Got exit code 1 2025-12-04T10:49:11.0445590Z Retrying single test... 2025-12-04T10:49:11.0445788Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6d5af1af113a1342.xml 2025-12-04T10:49:11.0445846Z ============================= test session starts ============================== 2025-12-04T10:49:11.0445955Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0445998Z cachedir: .pytest_cache 2025-12-04T10:49:11.0446153Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0446199Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0446239Z configfile: pytest.ini 2025-12-04T10:49:11.0446400Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0446473Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0446782Z stepcurrent: skipping 18 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0446828Z Running 1 items in this shard 2025-12-04T10:49:11.0446830Z 2025-12-04T10:49:11.0447187Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:30.450564393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0447191Z 2025-12-04T10:49:11.0447343Z [W1204 10:17:38.029359063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0447345Z 2025-12-04T10:49:11.0447496Z [W1204 10:17:38.029531050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0447499Z 2025-12-04T10:49:11.0447651Z [W1204 10:17:38.033702990 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0447653Z 2025-12-04T10:49:11.0447801Z [W1204 10:17:38.034006114 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0447813Z 2025-12-04T10:49:11.0447958Z [W1204 10:17:38.034084713 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0447960Z 2025-12-04T10:49:11.0448117Z [W1204 10:17:38.036637954 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0448119Z 2025-12-04T10:49:11.0448265Z [W1204 10:17:38.036908689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0448269Z 2025-12-04T10:49:11.0448418Z [W1204 10:17:38.036984267 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0448420Z 2025-12-04T10:49:11.0448471Z ('RERUN', {'yellow': True}) [10.6444s] [100%] 2025-12-04T10:49:11.0448826Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:39.844296434 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0448830Z 2025-12-04T10:49:11.0448979Z [W1204 10:17:39.844702366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0448981Z 2025-12-04T10:49:11.0449127Z [W1204 10:17:39.844800314 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0449129Z 2025-12-04T10:49:11.0449278Z [W1204 10:17:39.846214547 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0449280Z 2025-12-04T10:49:11.0449426Z [W1204 10:17:39.846486612 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0449428Z 2025-12-04T10:49:11.0449576Z [W1204 10:17:39.846567970 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0449577Z 2025-12-04T10:49:11.0449728Z [W1204 10:17:39.848802828 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0449731Z 2025-12-04T10:49:11.0449878Z [W1204 10:17:39.849075582 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0449880Z 2025-12-04T10:49:11.0450045Z [W1204 10:17:39.849159581 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0450047Z 2025-12-04T10:49:11.0450095Z ('RERUN', {'yellow': True}) [0.6749s] [100%] 2025-12-04T10:49:11.0450449Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:39.519665839 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0450453Z 2025-12-04T10:49:11.0450602Z [W1204 10:17:39.520087041 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0450605Z 2025-12-04T10:49:11.0450752Z [W1204 10:17:39.520189979 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0450753Z 2025-12-04T10:49:11.0450903Z [W1204 10:17:39.521627841 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0450905Z 2025-12-04T10:49:11.0451050Z [W1204 10:17:39.521907566 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0451053Z 2025-12-04T10:49:11.0451199Z [W1204 10:17:39.521990384 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0451212Z 2025-12-04T10:49:11.0451358Z [W1204 10:17:39.524244391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0451371Z 2025-12-04T10:49:11.0451518Z [W1204 10:17:39.524512806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0451520Z 2025-12-04T10:49:11.0451670Z [W1204 10:17:39.524591144 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0451671Z 2025-12-04T10:49:11.0451709Z FAILED [0.6703s] [100%] 2025-12-04T10:49:11.0451711Z 2025-12-04T10:49:11.0451763Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0451947Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0451994Z Traceback (most recent call last): 2025-12-04T10:49:11.0452150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0452191Z method(*args, **kwargs) 2025-12-04T10:49:11.0452344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0452382Z method(*args, **kwargs) 2025-12-04T10:49:11.0452534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0452569Z with policy(): 2025-12-04T10:49:11.0452721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0452760Z raise RuntimeError(msg) 2025-12-04T10:49:11.0453152Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0453157Z 2025-12-04T10:49:11.0453229Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0453518Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0453558Z 2025-12-04T10:49:11.0453644Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0453716Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0453772Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0454043Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0454117Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0454152Z graph_break [] 2025-12-04T10:49:11.0454224Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0454568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0454613Z if out == self.unknown_value: 2025-12-04T10:49:11.0454762Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0454819Z Traceback (most recent call last): 2025-12-04T10:49:11.0454971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0455027Z method(*args, **kwargs) 2025-12-04T10:49:11.0455178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0455217Z method(*args, **kwargs) 2025-12-04T10:49:11.0455365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0455402Z with policy(): 2025-12-04T10:49:11.0455554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0455594Z raise RuntimeError(msg) 2025-12-04T10:49:11.0455992Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0455997Z 2025-12-04T10:49:11.0456068Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0456356Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0456358Z 2025-12-04T10:49:11.0456444Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0456514Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0456568Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0456838Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0456910Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0456947Z graph_break [] 2025-12-04T10:49:11.0457017Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0457380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0457424Z if out == self.unknown_value: 2025-12-04T10:49:11.0457494Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0457548Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0457619Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0457888Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0457927Z graph_break [] 2025-12-04T10:49:11.0457979Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0458127Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0458175Z Traceback (most recent call last): 2025-12-04T10:49:11.0458328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0458368Z method(*args, **kwargs) 2025-12-04T10:49:11.0458517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0458568Z method(*args, **kwargs) 2025-12-04T10:49:11.0458716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0458771Z with policy(): 2025-12-04T10:49:11.0458922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0458963Z raise RuntimeError(msg) 2025-12-04T10:49:11.0459363Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0459366Z 2025-12-04T10:49:11.0459438Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0459725Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0459730Z 2025-12-04T10:49:11.0459814Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0459884Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0459938Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0460210Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0460282Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0460319Z graph_break [] 2025-12-04T10:49:11.0460389Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0460734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0460781Z if out == self.unknown_value: 2025-12-04T10:49:11.0460852Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0460907Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0460999Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0461269Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0461304Z graph_break [] 2025-12-04T10:49:11.0461377Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0461431Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0461501Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0461769Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0461806Z graph_break [] 2025-12-04T10:49:11.0462089Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6d5af1af113a1342.xml - 2025-12-04T10:49:11.0462151Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0462774Z FAILED [0.6703s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0462804Z 2025-12-04T10:49:11.0462876Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0463165Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0463167Z 2025-12-04T10:49:11.0463251Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0463314Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0463381Z ================== 1 failed, 57 deselected, 2 rerun in 12.16s ================== 2025-12-04T10:49:11.0463420Z Got exit code 1 2025-12-04T10:49:11.0463460Z Retrying single test... 2025-12-04T10:49:11.0463660Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8cd753204fad372e.xml 2025-12-04T10:49:11.0463715Z ============================= test session starts ============================== 2025-12-04T10:49:11.0463829Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0463871Z cachedir: .pytest_cache 2025-12-04T10:49:11.0464030Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0464077Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0464118Z configfile: pytest.ini 2025-12-04T10:49:11.0464285Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0464356Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0464644Z stepcurrent: skipping 18 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0464688Z Running 1 items in this shard 2025-12-04T10:49:11.0464690Z 2025-12-04T10:49:11.0465079Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:50.775926148 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465082Z 2025-12-04T10:49:11.0465235Z [W1204 10:17:57.161900619 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465237Z 2025-12-04T10:49:11.0465388Z [W1204 10:17:57.162060596 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465391Z 2025-12-04T10:49:11.0465539Z [W1204 10:17:57.166263226 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465541Z 2025-12-04T10:49:11.0465687Z [W1204 10:17:57.166580690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465691Z 2025-12-04T10:49:11.0465838Z [W1204 10:17:57.166657469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0465840Z 2025-12-04T10:49:11.0465987Z [W1204 10:17:57.169308288 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0466001Z 2025-12-04T10:49:11.0466150Z [W1204 10:17:57.169579803 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0466164Z 2025-12-04T10:49:11.0466314Z [W1204 10:17:57.169655021 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0466315Z 2025-12-04T10:49:11.0466365Z ('RERUN', {'yellow': True}) [10.4884s] [100%] 2025-12-04T10:49:11.0466722Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:58.019561105 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0466725Z 2025-12-04T10:49:11.0466873Z [W1204 10:17:58.019966357 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0466876Z 2025-12-04T10:49:11.0467026Z [W1204 10:17:58.020071625 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467029Z 2025-12-04T10:49:11.0467177Z [W1204 10:17:58.021520777 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467181Z 2025-12-04T10:49:11.0467328Z [W1204 10:17:58.021788792 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467330Z 2025-12-04T10:49:11.0467485Z [W1204 10:17:58.021871620 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0467487Z 2025-12-04T10:49:11.0467634Z [W1204 10:17:58.024216886 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467637Z 2025-12-04T10:49:11.0467785Z [W1204 10:17:58.024484800 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467788Z 2025-12-04T10:49:11.0467934Z [W1204 10:17:58.024565399 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0467939Z 2025-12-04T10:49:11.0467988Z ('RERUN', {'yellow': True}) [0.7006s] [100%] 2025-12-04T10:49:11.0468362Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:17:59.686834710 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0468364Z 2025-12-04T10:49:11.0468510Z [W1204 10:17:59.687245852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0468513Z 2025-12-04T10:49:11.0468662Z [W1204 10:17:59.687350100 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0468664Z 2025-12-04T10:49:11.0468811Z [W1204 10:17:59.688788903 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0468813Z 2025-12-04T10:49:11.0468965Z [W1204 10:17:59.689071537 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0468967Z 2025-12-04T10:49:11.0469117Z [W1204 10:17:59.689154066 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0469118Z 2025-12-04T10:49:11.0469267Z [W1204 10:17:59.691509381 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0469279Z 2025-12-04T10:49:11.0469432Z [W1204 10:17:59.691772366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0469445Z 2025-12-04T10:49:11.0469593Z [W1204 10:17:59.691851884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0469595Z 2025-12-04T10:49:11.0469634Z FAILED [0.6660s] [100%] 2025-12-04T10:49:11.0469636Z 2025-12-04T10:49:11.0469687Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0469839Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0469887Z Traceback (most recent call last): 2025-12-04T10:49:11.0470044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0470087Z method(*args, **kwargs) 2025-12-04T10:49:11.0470238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0470280Z method(*args, **kwargs) 2025-12-04T10:49:11.0470430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0470468Z with policy(): 2025-12-04T10:49:11.0470620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0470660Z raise RuntimeError(msg) 2025-12-04T10:49:11.0471052Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0471055Z 2025-12-04T10:49:11.0471133Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0471419Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0471423Z 2025-12-04T10:49:11.0471510Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0471583Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0471656Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0471968Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0472040Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0472079Z graph_break [] 2025-12-04T10:49:11.0472150Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0472495Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0472539Z if out == self.unknown_value: 2025-12-04T10:49:11.0472692Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0472738Z Traceback (most recent call last): 2025-12-04T10:49:11.0472893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0472931Z method(*args, **kwargs) 2025-12-04T10:49:11.0473083Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0473149Z method(*args, **kwargs) 2025-12-04T10:49:11.0473302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0473353Z with policy(): 2025-12-04T10:49:11.0473506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0473546Z raise RuntimeError(msg) 2025-12-04T10:49:11.0473949Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0473952Z 2025-12-04T10:49:11.0474025Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0474309Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0474313Z 2025-12-04T10:49:11.0474399Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0474469Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0474527Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0474799Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0474872Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0474910Z graph_break [] 2025-12-04T10:49:11.0474983Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0475325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0475369Z if out == self.unknown_value: 2025-12-04T10:49:11.0475440Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0475495Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0475592Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0475859Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0475897Z graph_break [] 2025-12-04T10:49:11.0475949Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0476100Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0476146Z Traceback (most recent call last): 2025-12-04T10:49:11.0476300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0476339Z method(*args, **kwargs) 2025-12-04T10:49:11.0476494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0476533Z method(*args, **kwargs) 2025-12-04T10:49:11.0476683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0476719Z with policy(): 2025-12-04T10:49:11.0476883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0476922Z raise RuntimeError(msg) 2025-12-04T10:49:11.0477335Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0477337Z 2025-12-04T10:49:11.0477410Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0477695Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0477697Z 2025-12-04T10:49:11.0477785Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0477855Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0477911Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0478184Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0478257Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0478292Z graph_break [] 2025-12-04T10:49:11.0478367Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0478707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0478750Z if out == self.unknown_value: 2025-12-04T10:49:11.0478824Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0478879Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0478952Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0479222Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0479280Z graph_break [] 2025-12-04T10:49:11.0479350Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0479405Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0479475Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0479741Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0479778Z graph_break [] 2025-12-04T10:49:11.0480020Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8cd753204fad372e.xml - 2025-12-04T10:49:11.0480080Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0480709Z FAILED [0.6660s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0480722Z 2025-12-04T10:49:11.0480795Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0481089Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0481091Z 2025-12-04T10:49:11.0481178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0481240Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0481306Z ================== 1 failed, 57 deselected, 2 rerun in 12.03s ================== 2025-12-04T10:49:11.0481341Z Got exit code 1 2025-12-04T10:49:11.0481578Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0481708Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0481934Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5c0df7d8c6af8b9c.xml 2025-12-04T10:49:11.0481992Z ============================= test session starts ============================== 2025-12-04T10:49:11.0482103Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0482188Z cachedir: .pytest_cache 2025-12-04T10:49:11.0484127Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0484178Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0484220Z configfile: pytest.ini 2025-12-04T10:49:11.0484385Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0484462Z collecting ... collected 58 items / 19 deselected / 39 selected 2025-12-04T10:49:11.0484517Z stepcurrent: skipping 19 already run items. 2025-12-04T10:49:11.0484562Z Running 39 items in this shard 2025-12-04T10:49:11.0484564Z 2025-12-04T10:49:11.0484815Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7516s] [ 2%] 2025-12-04T10:49:11.0485100Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6502s] [ 2%] 2025-12-04T10:49:11.0485321Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.6469s] [ 2%] 2025-12-04T10:49:11.0485324Z 2025-12-04T10:49:11.0485375Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0485524Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0485570Z Traceback (most recent call last): 2025-12-04T10:49:11.0485728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0485768Z method(*args, **kwargs) 2025-12-04T10:49:11.0485923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0485961Z method(*args, **kwargs) 2025-12-04T10:49:11.0486111Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0486176Z with policy(): 2025-12-04T10:49:11.0486327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0486367Z raise RuntimeError(msg) 2025-12-04T10:49:11.0486784Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0486786Z 2025-12-04T10:49:11.0486864Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0487153Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0487155Z 2025-12-04T10:49:11.0487242Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0487313Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0487369Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0487546Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0487617Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0487653Z graph_break [] 2025-12-04T10:49:11.0487806Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0487850Z Traceback (most recent call last): 
2025-12-04T10:49:11.0488003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0488044Z method(*args, **kwargs) 2025-12-04T10:49:11.0488194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0488233Z method(*args, **kwargs) 2025-12-04T10:49:11.0488382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0488419Z with policy(): 2025-12-04T10:49:11.0488567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0488608Z raise RuntimeError(msg) 2025-12-04T10:49:11.0489034Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0489038Z 2025-12-04T10:49:11.0489110Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0489396Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0489399Z 2025-12-04T10:49:11.0489484Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0489556Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0489610Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0489786Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0489857Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0489903Z graph_break [] 2025-12-04T10:49:11.0489973Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0490027Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0490108Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0490280Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0490315Z graph_break [] 2025-12-04T10:49:11.0490368Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0490518Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0490563Z Traceback (most recent call last): 2025-12-04T10:49:11.0490717Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0490758Z method(*args, **kwargs) 2025-12-04T10:49:11.0490908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0490948Z method(*args, **kwargs) 2025-12-04T10:49:11.0491098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0491135Z with policy(): 2025-12-04T10:49:11.0491285Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0491325Z raise RuntimeError(msg) 2025-12-04T10:49:11.0491732Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0491736Z 2025-12-04T10:49:11.0491807Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0492131Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0492134Z 2025-12-04T10:49:11.0492218Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0492289Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0492369Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0492542Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0492612Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0492649Z graph_break [] 2025-12-04T10:49:11.0492718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0492772Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0492842Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0493015Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0493051Z graph_break [] 2025-12-04T10:49:11.0493123Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0493175Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0493245Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0493416Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0493475Z graph_break [] 2025-12-04T10:49:11.0493715Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5c0df7d8c6af8b9c.xml - 2025-12-04T10:49:11.0493799Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0494434Z FAILED [0.6469s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0494436Z 2025-12-04T10:49:11.0494507Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0494793Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0494796Z 2025-12-04T10:49:11.0494880Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0494940Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0495007Z ================== 1 failed, 19 deselected, 2 rerun in 4.22s =================== 2025-12-04T10:49:11.0495044Z Got exit code 1 2025-12-04T10:49:11.0495084Z Retrying single test... 
2025-12-04T10:49:11.0495278Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f4aed3974486ec12.xml 2025-12-04T10:49:11.0495336Z ============================= test session starts ============================== 2025-12-04T10:49:11.0495448Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0495488Z cachedir: .pytest_cache 2025-12-04T10:49:11.0495646Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0495692Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0495731Z configfile: pytest.ini 2025-12-04T10:49:11.0495894Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0495988Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0496271Z stepcurrent: skipping 19 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0496316Z Running 1 items in this shard 2025-12-04T10:49:11.0496318Z 2025-12-04T10:49:11.0496678Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:19.474240716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0496682Z 2025-12-04T10:49:11.0496834Z [W1204 10:18:27.926942297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0496836Z 2025-12-04T10:49:11.0496987Z [W1204 10:18:27.927100844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0496989Z 2025-12-04T10:49:11.0497137Z [W1204 10:18:27.931080948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497148Z 2025-12-04T10:49:11.0497295Z [W1204 10:18:27.931387532 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497298Z 2025-12-04T10:49:11.0497455Z [W1204 10:18:27.931464081 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497457Z 2025-12-04T10:49:11.0497606Z [W1204 10:18:27.933893035 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497608Z 2025-12-04T10:49:11.0497755Z [W1204 10:18:27.934168760 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497757Z 2025-12-04T10:49:11.0497905Z [W1204 10:18:27.934249108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0497908Z 2025-12-04T10:49:11.0497957Z ('RERUN', {'yellow': True}) [10.2466s] [100%] 2025-12-04T10:49:11.0498314Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:28.135238959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0498317Z 2025-12-04T10:49:11.0498464Z [W1204 10:18:28.135692810 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0498466Z 2025-12-04T10:49:11.0498614Z [W1204 10:18:28.135785628 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0498616Z 2025-12-04T10:49:11.0498763Z [W1204 10:18:28.137184202 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0498765Z 2025-12-04T10:49:11.0498911Z [W1204 10:18:28.137534875 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0498913Z 2025-12-04T10:49:11.0499060Z [W1204 10:18:28.137619014 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0499062Z 2025-12-04T10:49:11.0499209Z [W1204 10:18:28.139837681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0499211Z 2025-12-04T10:49:11.0499383Z [W1204 10:18:28.140116516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0499385Z 2025-12-04T10:49:11.0499532Z [W1204 10:18:28.140199324 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0499533Z 2025-12-04T10:49:11.0499583Z ('RERUN', {'yellow': True}) [0.6885s] [100%] 2025-12-04T10:49:11.0499939Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:29.772708751 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0499942Z 2025-12-04T10:49:11.0500088Z [W1204 10:18:29.773432107 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500091Z 2025-12-04T10:49:11.0500238Z [W1204 10:18:29.773539125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500240Z 2025-12-04T10:49:11.0500387Z [W1204 10:18:29.775248482 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500388Z 2025-12-04T10:49:11.0500549Z [W1204 10:18:29.775653244 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500551Z 2025-12-04T10:49:11.0500697Z [W1204 10:18:29.775732503 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500709Z 2025-12-04T10:49:11.0500855Z [W1204 10:18:29.778165957 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0500858Z 2025-12-04T10:49:11.0501006Z [W1204 10:18:29.778448601 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0501008Z 2025-12-04T10:49:11.0501154Z [W1204 10:18:29.778526030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0501156Z 2025-12-04T10:49:11.0501193Z FAILED [0.6532s] [100%] 2025-12-04T10:49:11.0501196Z 2025-12-04T10:49:11.0501249Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0501398Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0501445Z Traceback (most recent call last): 2025-12-04T10:49:11.0501601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0501641Z method(*args, **kwargs) 2025-12-04T10:49:11.0501795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0501835Z method(*args, **kwargs) 2025-12-04T10:49:11.0502031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0502068Z with policy(): 2025-12-04T10:49:11.0502220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0502260Z raise RuntimeError(msg) 2025-12-04T10:49:11.0502656Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0502660Z 2025-12-04T10:49:11.0502732Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0503054Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0503057Z 2025-12-04T10:49:11.0503142Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0503216Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0503270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0503446Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0503517Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0503554Z graph_break [] 2025-12-04T10:49:11.0503624Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0503975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0504019Z if out == self.unknown_value: 2025-12-04T10:49:11.0504180Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0504225Z Traceback (most recent call last): 2025-12-04T10:49:11.0504391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0504431Z method(*args, **kwargs) 2025-12-04T10:49:11.0504579Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0504618Z method(*args, **kwargs) 2025-12-04T10:49:11.0504767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0504803Z with policy(): 2025-12-04T10:49:11.0504953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0504994Z raise RuntimeError(msg) 2025-12-04T10:49:11.0505397Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0505402Z 2025-12-04T10:49:11.0505474Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0505763Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0505767Z 2025-12-04T10:49:11.0505851Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0505923Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0505981Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0506155Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0506227Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0506263Z graph_break [] 2025-12-04T10:49:11.0506333Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0506695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0506738Z if out == self.unknown_value: 2025-12-04T10:49:11.0506808Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0506862Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0506933Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0507105Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0507142Z graph_break [] 2025-12-04T10:49:11.0507192Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0507344Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0507392Z Traceback (most recent call last): 2025-12-04T10:49:11.0507545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0507584Z method(*args, **kwargs) 2025-12-04T10:49:11.0507733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0507782Z method(*args, **kwargs) 2025-12-04T10:49:11.0507930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0507979Z with policy(): 2025-12-04T10:49:11.0508129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0508170Z raise RuntimeError(msg) 2025-12-04T10:49:11.0508577Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0508579Z 2025-12-04T10:49:11.0508651Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0508937Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0508940Z 2025-12-04T10:49:11.0509025Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0509094Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0509149Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0509322Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0509392Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0509428Z graph_break [] 2025-12-04T10:49:11.0509497Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0509836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0509879Z if out == self.unknown_value: 2025-12-04T10:49:11.0509949Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0510002Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0510072Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0510263Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0510300Z graph_break [] 2025-12-04T10:49:11.0510370Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0510425Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0510493Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0510665Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0510701Z graph_break [] 2025-12-04T10:49:11.0510942Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f4aed3974486ec12.xml - 2025-12-04T10:49:11.0511003Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0511639Z FAILED [0.6532s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0511664Z 2025-12-04T10:49:11.0511735Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0512069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0512072Z 2025-12-04T10:49:11.0512158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0512218Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0512284Z ================== 1 failed, 57 deselected, 2 rerun in 11.76s ================== 2025-12-04T10:49:11.0512320Z Got exit code 1 2025-12-04T10:49:11.0512361Z Retrying single test... 2025-12-04T10:49:11.0512556Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b875236b1f901482.xml 2025-12-04T10:49:11.0512613Z ============================= test session starts ============================== 2025-12-04T10:49:11.0512725Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0512765Z cachedir: .pytest_cache 2025-12-04T10:49:11.0512924Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0512971Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0513011Z configfile: pytest.ini 2025-12-04T10:49:11.0513173Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0513246Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0513530Z stepcurrent: skipping 19 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0513575Z Running 1 items in this shard 2025-12-04T10:49:11.0513577Z 2025-12-04T10:49:11.0513974Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:38.266870890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0513977Z 2025-12-04T10:49:11.0514128Z [W1204 10:18:45.479191891 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514130Z 2025-12-04T10:49:11.0514279Z [W1204 10:18:45.479341199 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514282Z 2025-12-04T10:49:11.0514429Z [W1204 10:18:45.483157836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514432Z 2025-12-04T10:49:11.0514579Z [W1204 10:18:45.483466570 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514580Z 2025-12-04T10:49:11.0514727Z [W1204 10:18:45.483555328 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514730Z 2025-12-04T10:49:11.0514878Z [W1204 10:18:45.486116360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0514880Z 2025-12-04T10:49:11.0515027Z [W1204 10:18:45.486378555 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0515043Z 2025-12-04T10:49:11.0515190Z [W1204 10:18:45.486455443 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0515206Z 2025-12-04T10:49:11.0515255Z ('RERUN', {'yellow': True}) [9.8410s] [100%] 2025-12-04T10:49:11.0515611Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:46.486472208 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0515614Z 2025-12-04T10:49:11.0515762Z [W1204 10:18:46.486890900 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0515764Z 2025-12-04T10:49:11.0515911Z [W1204 10:18:46.486983448 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0515914Z 2025-12-04T10:49:11.0516061Z [W1204 10:18:46.488387041 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0516064Z 2025-12-04T10:49:11.0516212Z [W1204 10:18:46.488736755 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0516214Z 2025-12-04T10:49:11.0516360Z [W1204 10:18:46.488818773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0516362Z 2025-12-04T10:49:11.0516512Z [W1204 10:18:46.491014612 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0516513Z 2025-12-04T10:49:11.0516659Z [W1204 10:18:46.491281936 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0516663Z 2025-12-04T10:49:11.0516808Z [W1204 10:18:46.491359065 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0516811Z 2025-12-04T10:49:11.0516858Z ('RERUN', {'yellow': True}) [0.4915s] [100%] 2025-12-04T10:49:11.0517216Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:18:47.990977340 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0517219Z 2025-12-04T10:49:11.0517385Z [W1204 10:18:47.991369943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0517387Z 2025-12-04T10:49:11.0517532Z [W1204 10:18:47.991463611 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0517536Z 2025-12-04T10:49:11.0517682Z [W1204 10:18:47.992849125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0517685Z 2025-12-04T10:49:11.0517832Z [W1204 10:18:47.993188789 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0517834Z 2025-12-04T10:49:11.0517979Z [W1204 10:18:47.993271547 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0517981Z 2025-12-04T10:49:11.0518130Z [W1204 10:18:47.995477415 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0518132Z 2025-12-04T10:49:11.0518278Z [W1204 10:18:47.995739210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0518291Z 2025-12-04T10:49:11.0518438Z [W1204 10:18:47.995816918 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0518452Z 2025-12-04T10:49:11.0518491Z FAILED [0.5012s] [100%] 2025-12-04T10:49:11.0518493Z 2025-12-04T10:49:11.0518544Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0518694Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0518739Z Traceback (most recent call last): 2025-12-04T10:49:11.0518895Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0518934Z method(*args, **kwargs) 2025-12-04T10:49:11.0519088Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0519129Z method(*args, **kwargs) 2025-12-04T10:49:11.0519280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0519316Z with policy(): 2025-12-04T10:49:11.0519469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0519509Z raise RuntimeError(msg) 2025-12-04T10:49:11.0519909Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0519911Z 2025-12-04T10:49:11.0519984Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0520274Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0520279Z 2025-12-04T10:49:11.0520366Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0520437Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0520492Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0520687Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0520759Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0520795Z graph_break [] 2025-12-04T10:49:11.0520865Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0521207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0521252Z if out == self.unknown_value: 2025-12-04T10:49:11.0521401Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0521446Z Traceback (most recent call last): 2025-12-04T10:49:11.0521597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0521638Z method(*args, **kwargs) 2025-12-04T10:49:11.0521787Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0521826Z method(*args, **kwargs) 2025-12-04T10:49:11.0522207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0522263Z with policy(): 2025-12-04T10:49:11.0522414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0522470Z raise RuntimeError(msg) 2025-12-04T10:49:11.0522874Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0522878Z 2025-12-04T10:49:11.0522948Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0523235Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0523239Z 2025-12-04T10:49:11.0523323Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0523394Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0523449Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0523623Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0523695Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0523733Z graph_break [] 2025-12-04T10:49:11.0523804Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0524142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0524188Z if out == self.unknown_value: 2025-12-04T10:49:11.0524258Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0524315Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0524384Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0524556Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0524591Z graph_break [] 2025-12-04T10:49:11.0524671Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0524820Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0524865Z Traceback (most recent call last): 2025-12-04T10:49:11.0525018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0525058Z method(*args, **kwargs) 2025-12-04T10:49:11.0525209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0525248Z method(*args, **kwargs) 2025-12-04T10:49:11.0525396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0525433Z with policy(): 2025-12-04T10:49:11.0525584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0525625Z raise RuntimeError(msg) 2025-12-04T10:49:11.0526033Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0526047Z 2025-12-04T10:49:11.0526137Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0526422Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0526424Z 2025-12-04T10:49:11.0526509Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0526579Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0526634Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0526806Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0526877Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0526912Z graph_break [] 2025-12-04T10:49:11.0526983Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0527322Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0527365Z if out == self.unknown_value: 2025-12-04T10:49:11.0527435Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0527489Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0527558Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0527731Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0527767Z graph_break [] 2025-12-04T10:49:11.0527838Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0527892Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0527962Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0528134Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0528170Z graph_break [] 2025-12-04T10:49:11.0528428Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b875236b1f901482.xml - 2025-12-04T10:49:11.0528487Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0529120Z FAILED [0.5012s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0529124Z 2025-12-04T10:49:11.0529195Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0529484Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0529486Z 2025-12-04T10:49:11.0529570Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0529641Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0529707Z ================== 1 failed, 57 deselected, 2 rerun in 11.00s ================== 2025-12-04T10:49:11.0529756Z Got exit code 1 2025-12-04T10:49:11.0529994Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0530120Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0530318Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c02d3d3ab8b1391b.xml 2025-12-04T10:49:11.0530374Z ============================= test session starts ============================== 2025-12-04T10:49:11.0530484Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0530526Z cachedir: .pytest_cache 2025-12-04T10:49:11.0530682Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0530728Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0530769Z configfile: pytest.ini 2025-12-04T10:49:11.0530930Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0531003Z collecting ... collected 58 items / 20 deselected / 38 selected 2025-12-04T10:49:11.0531055Z stepcurrent: skipping 20 already run items. 2025-12-04T10:49:11.0531099Z Running 38 items in this shard 2025-12-04T10:49:11.0531101Z 2025-12-04T10:49:11.0531347Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5257s] [ 2%] 2025-12-04T10:49:11.0531588Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4315s] [ 2%] 2025-12-04T10:49:11.0531808Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.4178s] [ 2%] 2025-12-04T10:49:11.0531810Z 2025-12-04T10:49:11.0531894Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0532072Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0532116Z Traceback (most recent call last): 2025-12-04T10:49:11.0532272Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0532310Z method(*args, **kwargs) 2025-12-04T10:49:11.0532463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0532502Z method(*args, **kwargs) 2025-12-04T10:49:11.0532652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0532688Z with policy(): 2025-12-04T10:49:11.0532839Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0532878Z raise RuntimeError(msg) 2025-12-04T10:49:11.0533270Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0533273Z 2025-12-04T10:49:11.0533359Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0533644Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0533662Z 2025-12-04T10:49:11.0533747Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0533817Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0533872Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0534047Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0534118Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0534154Z graph_break [] 2025-12-04T10:49:11.0534302Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0534345Z Traceback (most recent call last): 2025-12-04T10:49:11.0534499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0534538Z method(*args, **kwargs) 
2025-12-04T10:49:11.0534689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0534728Z method(*args, **kwargs) 2025-12-04T10:49:11.0534878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0534914Z with policy(): 2025-12-04T10:49:11.0535065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0535105Z raise RuntimeError(msg) 2025-12-04T10:49:11.0535503Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0535506Z 2025-12-04T10:49:11.0535577Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0535882Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0535884Z 2025-12-04T10:49:11.0535969Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0536038Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0536094Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0536266Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0536339Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.0536375Z graph_break [] 2025-12-04T10:49:11.0536445Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0536500Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0536569Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0536743Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0536778Z graph_break [] 2025-12-04T10:49:11.0536829Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0536986Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0537031Z Traceback (most recent call last): 2025-12-04T10:49:11.0537194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0537233Z method(*args, **kwargs) 2025-12-04T10:49:11.0537381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0537419Z method(*args, **kwargs) 2025-12-04T10:49:11.0537568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0537605Z with policy(): 2025-12-04T10:49:11.0537755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0537796Z raise RuntimeError(msg) 2025-12-04T10:49:11.0538192Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0538196Z 2025-12-04T10:49:11.0538267Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0538551Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0538553Z 2025-12-04T10:49:11.0538635Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0538706Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0538760Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0538934Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0539005Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0539041Z graph_break [] 2025-12-04T10:49:11.0539111Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0539165Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0539258Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0539430Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0539465Z graph_break [] 2025-12-04T10:49:11.0539536Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0539590Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0539660Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0539830Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0539866Z graph_break [] 2025-12-04T10:49:11.0540106Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c02d3d3ab8b1391b.xml - 2025-12-04T10:49:11.0540166Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0540786Z FAILED [0.4178s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0540818Z 2025-12-04T10:49:11.0540889Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0541177Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0541181Z 2025-12-04T10:49:11.0541265Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0541325Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0541391Z ================== 1 failed, 20 deselected, 2 rerun in 3.54s =================== 2025-12-04T10:49:11.0541427Z Got exit code 1 2025-12-04T10:49:11.0541467Z Retrying single test... 
2025-12-04T10:49:11.0541662Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4781cda0fec93ea.xml 2025-12-04T10:49:11.0541720Z ============================= test session starts ============================== 2025-12-04T10:49:11.0541830Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0541914Z cachedir: .pytest_cache 2025-12-04T10:49:11.0542072Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0542119Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0542159Z configfile: pytest.ini 2025-12-04T10:49:11.0542320Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0542392Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0542674Z stepcurrent: skipping 20 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0542719Z Running 1 items in this shard 2025-12-04T10:49:11.0542721Z 2025-12-04T10:49:11.0543105Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:05.460032860 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0543107Z 2025-12-04T10:49:11.0543261Z [W1204 10:19:13.139845034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0543264Z 2025-12-04T10:49:11.0543415Z [W1204 10:19:13.139990991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0543417Z 2025-12-04T10:49:11.0543564Z [W1204 10:19:13.143775729 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0543568Z 2025-12-04T10:49:11.0543714Z [W1204 10:19:13.144075344 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0543716Z 2025-12-04T10:49:11.0543864Z [W1204 10:19:13.144158802 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0543866Z 2025-12-04T10:49:11.0544012Z [W1204 10:19:13.146513907 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0544014Z 2025-12-04T10:49:11.0544175Z [W1204 10:19:13.146773752 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0544177Z 2025-12-04T10:49:11.0544324Z [W1204 10:19:13.146849191 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0544342Z 2025-12-04T10:49:11.0544392Z ('RERUN', {'yellow': True}) [10.2098s] [100%] 2025-12-04T10:49:11.0544747Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:14.092571344 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0544750Z 2025-12-04T10:49:11.0544898Z [W1204 10:19:14.092960677 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0544900Z 2025-12-04T10:49:11.0545046Z [W1204 10:19:14.093057605 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545048Z 2025-12-04T10:49:11.0545194Z [W1204 10:19:14.094442508 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545197Z 2025-12-04T10:49:11.0545343Z [W1204 10:19:14.094778132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545345Z 2025-12-04T10:49:11.0545493Z [W1204 10:19:14.094859101 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545495Z 2025-12-04T10:49:11.0545641Z [W1204 10:19:14.097034089 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545644Z 2025-12-04T10:49:11.0545789Z [W1204 10:19:14.097294864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0545792Z 2025-12-04T10:49:11.0545938Z [W1204 10:19:14.097371363 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0545941Z 2025-12-04T10:49:11.0545989Z ('RERUN', {'yellow': True}) [0.4461s] [100%] 2025-12-04T10:49:11.0546363Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:15.527736876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0546365Z 2025-12-04T10:49:11.0546512Z [W1204 10:19:15.528133299 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0546514Z 2025-12-04T10:49:11.0546660Z [W1204 10:19:15.528230247 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0546663Z 2025-12-04T10:49:11.0546810Z [W1204 10:19:15.529612991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0546813Z 2025-12-04T10:49:11.0546958Z [W1204 10:19:15.529950864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0546960Z 2025-12-04T10:49:11.0547108Z [W1204 10:19:15.530032383 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0547110Z 2025-12-04T10:49:11.0547255Z [W1204 10:19:15.532230051 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0547257Z 2025-12-04T10:49:11.0547406Z [W1204 10:19:15.532490516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0547417Z 2025-12-04T10:49:11.0547564Z [W1204 10:19:15.532565315 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0547579Z 2025-12-04T10:49:11.0547617Z FAILED [0.4310s] [100%] 2025-12-04T10:49:11.0547618Z 2025-12-04T10:49:11.0547670Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0547818Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0547865Z Traceback (most recent call last): 2025-12-04T10:49:11.0548020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0548061Z method(*args, **kwargs) 2025-12-04T10:49:11.0548211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0548251Z method(*args, **kwargs) 2025-12-04T10:49:11.0548400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0548438Z with policy(): 2025-12-04T10:49:11.0548589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0548631Z raise RuntimeError(msg) 2025-12-04T10:49:11.0549026Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0549030Z 2025-12-04T10:49:11.0549102Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0549388Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0549391Z 2025-12-04T10:49:11.0549475Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0549547Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0549603Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0549803Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0549874Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0549911Z graph_break [] 2025-12-04T10:49:11.0549981Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0550326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0550370Z if out == self.unknown_value: 2025-12-04T10:49:11.0550518Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0550561Z Traceback (most recent call last): 2025-12-04T10:49:11.0550715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0550755Z method(*args, **kwargs) 2025-12-04T10:49:11.0550904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0550943Z method(*args, **kwargs) 2025-12-04T10:49:11.0551102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0551139Z with policy(): 2025-12-04T10:49:11.0551290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0551342Z raise RuntimeError(msg) 2025-12-04T10:49:11.0551738Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0551741Z 2025-12-04T10:49:11.0551813Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0552133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0552136Z 2025-12-04T10:49:11.0552221Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0552293Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0552348Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0552522Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0552594Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0552630Z graph_break [] 2025-12-04T10:49:11.0552699Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0553039Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0553082Z if out == self.unknown_value: 2025-12-04T10:49:11.0553153Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0553207Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0553277Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0553449Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0553527Z graph_break [] 2025-12-04T10:49:11.0553578Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0553727Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0553772Z Traceback (most recent call last): 2025-12-04T10:49:11.0553925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0553963Z method(*args, **kwargs) 2025-12-04T10:49:11.0554115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0554152Z method(*args, **kwargs) 2025-12-04T10:49:11.0554301Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0554338Z with policy(): 2025-12-04T10:49:11.0554489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0554529Z raise RuntimeError(msg) 2025-12-04T10:49:11.0554925Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0554955Z 2025-12-04T10:49:11.0555027Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0555309Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0555311Z 2025-12-04T10:49:11.0555398Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0555468Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0555523Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0555696Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0555767Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0555805Z graph_break [] 2025-12-04T10:49:11.0555874Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0556219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0556262Z if out == self.unknown_value: 2025-12-04T10:49:11.0556334Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0556387Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0556457Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0556631Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0556667Z graph_break [] 2025-12-04T10:49:11.0556737Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0556790Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0556859Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0557029Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0557084Z graph_break [] 2025-12-04T10:49:11.0557325Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4781cda0fec93ea.xml - 2025-12-04T10:49:11.0557382Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0558003Z FAILED [0.4310s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0558007Z 2025-12-04T10:49:11.0558078Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0558365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0558367Z 2025-12-04T10:49:11.0558464Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0558524Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0558590Z ================== 1 failed, 57 deselected, 2 rerun in 11.25s ================== 2025-12-04T10:49:11.0558635Z Got exit code 1 2025-12-04T10:49:11.0558676Z Retrying single test... 2025-12-04T10:49:11.0558870Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7c93e86cea06e972.xml 2025-12-04T10:49:11.0558926Z ============================= test session starts ============================== 2025-12-04T10:49:11.0559037Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0559079Z cachedir: .pytest_cache 2025-12-04T10:49:11.0559237Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0559282Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0559322Z configfile: pytest.ini 2025-12-04T10:49:11.0559482Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0559556Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0559838Z stepcurrent: skipping 20 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0559882Z Running 1 items in this shard 2025-12-04T10:49:11.0559886Z 2025-12-04T10:49:11.0560244Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:23.116835466 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0560248Z 2025-12-04T10:49:11.0560398Z [W1204 10:19:30.495027563 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0560401Z 2025-12-04T10:49:11.0560551Z [W1204 10:19:30.495190000 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0560553Z 2025-12-04T10:49:11.0560700Z [W1204 10:19:30.498557826 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0560701Z 2025-12-04T10:49:11.0560876Z [W1204 10:19:30.498912420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0560878Z 2025-12-04T10:49:11.0561025Z [W1204 10:19:30.498991878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0561028Z 2025-12-04T10:49:11.0561176Z [W1204 10:19:30.501494571 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0561178Z 2025-12-04T10:49:11.0561324Z [W1204 10:19:30.501777795 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0561327Z 2025-12-04T10:49:11.0561473Z [W1204 10:19:30.501854484 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0561474Z 2025-12-04T10:49:11.0561524Z ('RERUN', {'yellow': True}) [9.8803s] [100%] 2025-12-04T10:49:11.0561913Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:31.440379611 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0561937Z 2025-12-04T10:49:11.0562084Z [W1204 10:19:31.440763923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0562086Z 2025-12-04T10:49:11.0562254Z [W1204 10:19:31.440852942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0562256Z 2025-12-04T10:49:11.0562403Z [W1204 10:19:31.442218226 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0562405Z 2025-12-04T10:49:11.0562552Z [W1204 10:19:31.442542250 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0562554Z 2025-12-04T10:49:11.0562700Z [W1204 10:19:31.442626948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0562702Z 2025-12-04T10:49:11.0562849Z [W1204 10:19:31.444802157 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0562851Z 2025-12-04T10:49:11.0562997Z [W1204 10:19:31.445065002 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563000Z 2025-12-04T10:49:11.0563147Z [W1204 10:19:31.445142950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563149Z 2025-12-04T10:49:11.0563197Z ('RERUN', {'yellow': True}) [0.4532s] [100%] 2025-12-04T10:49:11.0563550Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:19:32.919494803 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563553Z 2025-12-04T10:49:11.0563699Z [W1204 10:19:32.919895685 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563701Z 2025-12-04T10:49:11.0563848Z [W1204 10:19:32.919992124 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563850Z 2025-12-04T10:49:11.0563997Z [W1204 10:19:32.921391317 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0563998Z 2025-12-04T10:49:11.0564181Z [W1204 10:19:32.921745460 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0564183Z 2025-12-04T10:49:11.0564332Z [W1204 10:19:32.921828179 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0564333Z 2025-12-04T10:49:11.0564483Z [W1204 10:19:32.924008447 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0564484Z 2025-12-04T10:49:11.0564630Z [W1204 10:19:32.924278552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0564633Z 2025-12-04T10:49:11.0564779Z [W1204 10:19:32.924356761 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0564781Z 2025-12-04T10:49:11.0564818Z FAILED [0.4703s] [100%] 2025-12-04T10:49:11.0564820Z 2025-12-04T10:49:11.0564874Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0565024Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0565068Z Traceback (most recent call last): 2025-12-04T10:49:11.0565225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0565275Z method(*args, **kwargs) 2025-12-04T10:49:11.0565426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0565476Z method(*args, **kwargs) 2025-12-04T10:49:11.0565629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0565665Z with policy(): 2025-12-04T10:49:11.0565818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0565858Z raise RuntimeError(msg) 2025-12-04T10:49:11.0566249Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0566252Z 2025-12-04T10:49:11.0566324Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0566611Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0566613Z 2025-12-04T10:49:11.0566699Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0566771Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0566826Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0567001Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0567074Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0567109Z graph_break [] 2025-12-04T10:49:11.0567180Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0567523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0567567Z if out == self.unknown_value: 2025-12-04T10:49:11.0567739Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0567787Z Traceback (most recent call last): 2025-12-04T10:49:11.0567937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0567978Z method(*args, **kwargs) 2025-12-04T10:49:11.0568127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0568167Z method(*args, **kwargs) 2025-12-04T10:49:11.0568317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0568354Z with policy(): 2025-12-04T10:49:11.0568505Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0568545Z raise RuntimeError(msg) 2025-12-04T10:49:11.0568943Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0568956Z 2025-12-04T10:49:11.0569027Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0569312Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0569324Z 2025-12-04T10:49:11.0569409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0569480Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0569534Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0569710Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0569780Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0569818Z graph_break [] 2025-12-04T10:49:11.0569888Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0570229Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0570274Z if out == self.unknown_value: 2025-12-04T10:49:11.0570343Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0570398Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0570470Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0570643Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0570679Z graph_break [] 2025-12-04T10:49:11.0570732Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0570879Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0570926Z Traceback (most recent call last): 2025-12-04T10:49:11.0571079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0571118Z method(*args, **kwargs) 2025-12-04T10:49:11.0571267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0571326Z method(*args, **kwargs) 2025-12-04T10:49:11.0571475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0571512Z with policy(): 2025-12-04T10:49:11.0571662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0571704Z raise RuntimeError(msg) 2025-12-04T10:49:11.0572123Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0572127Z 2025-12-04T10:49:11.0572199Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0572483Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0572486Z 2025-12-04T10:49:11.0572570Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0572669Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0572723Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0572896Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0572978Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0573015Z graph_break [] 2025-12-04T10:49:11.0573084Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0573425Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0573467Z if out == self.unknown_value: 2025-12-04T10:49:11.0573538Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0573592Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0573663Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0573837Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0573872Z graph_break [] 2025-12-04T10:49:11.0573943Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0573996Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0574068Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0574241Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0574277Z graph_break [] 2025-12-04T10:49:11.0574517Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7c93e86cea06e972.xml - 2025-12-04T10:49:11.0574575Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0575221Z FAILED [0.4703s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0575225Z 2025-12-04T10:49:11.0575295Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0575579Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0575582Z 2025-12-04T10:49:11.0575665Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0575726Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0575791Z ================== 1 failed, 57 deselected, 2 rerun in 10.97s ================== 2025-12-04T10:49:11.0575828Z Got exit code 1 2025-12-04T10:49:11.0576063Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0576191Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0576388Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c105032a6cddc74a.xml 2025-12-04T10:49:11.0576456Z ============================= test session starts ============================== 2025-12-04T10:49:11.0576577Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0576618Z cachedir: .pytest_cache 2025-12-04T10:49:11.0576775Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0576821Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0576860Z configfile: pytest.ini 2025-12-04T10:49:11.0577022Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0577096Z collecting ... collected 58 items / 21 deselected / 37 selected 2025-12-04T10:49:11.0577147Z stepcurrent: skipping 21 already run items. 2025-12-04T10:49:11.0577191Z Running 37 items in this shard 2025-12-04T10:49:11.0577193Z 2025-12-04T10:49:11.0577439Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8942s] [ 2%] 2025-12-04T10:49:11.0577683Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4833s] [ 2%] 2025-12-04T10:49:11.0577903Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.4773s] [ 2%] 2025-12-04T10:49:11.0577905Z 2025-12-04T10:49:11.0577957Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0578102Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0578148Z Traceback (most recent call last): 2025-12-04T10:49:11.0578303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0578344Z method(*args, **kwargs) 2025-12-04T10:49:11.0578495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0578536Z method(*args, **kwargs) 2025-12-04T10:49:11.0578685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0578721Z with policy(): 2025-12-04T10:49:11.0578900Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0578939Z raise RuntimeError(msg) 2025-12-04T10:49:11.0579329Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0579333Z 2025-12-04T10:49:11.0579404Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0579690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0579691Z 2025-12-04T10:49:11.0579777Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0579848Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0579901Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0580172Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0580269Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0580304Z graph_break [] 2025-12-04T10:49:11.0580452Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0580496Z Traceback (most recent call last): 2025-12-04T10:49:11.0580652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.0580690Z method(*args, **kwargs) 2025-12-04T10:49:11.0580840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0580878Z method(*args, **kwargs) 2025-12-04T10:49:11.0581027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0581063Z with policy(): 2025-12-04T10:49:11.0581214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0581254Z raise RuntimeError(msg) 2025-12-04T10:49:11.0581657Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0581659Z 2025-12-04T10:49:11.0581730Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0582057Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0582060Z 2025-12-04T10:49:11.0582145Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0582216Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0582270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0582538Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0582636Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0582671Z graph_break [] 2025-12-04T10:49:11.0582744Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0582797Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0582868Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0583134Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0583171Z graph_break [] 2025-12-04T10:49:11.0583222Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0583371Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0583417Z Traceback (most recent call last): 2025-12-04T10:49:11.0583570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0583609Z method(*args, **kwargs) 2025-12-04T10:49:11.0583760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0583812Z method(*args, **kwargs) 2025-12-04T10:49:11.0583960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0584013Z with policy(): 2025-12-04T10:49:11.0584163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0584204Z raise RuntimeError(msg) 2025-12-04T10:49:11.0584602Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0584604Z 2025-12-04T10:49:11.0584676Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0584964Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0584967Z 2025-12-04T10:49:11.0585052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0585122Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0585176Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0585449Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0585519Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0585556Z graph_break [] 2025-12-04T10:49:11.0585626Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0585680Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0585751Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0586017Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0586052Z graph_break [] 2025-12-04T10:49:11.0586142Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0586195Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0586265Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0586530Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0586567Z graph_break [] 2025-12-04T10:49:11.0586809Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c105032a6cddc74a.xml - 2025-12-04T10:49:11.0586868Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0587499Z FAILED [0.4773s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0587513Z 2025-12-04T10:49:11.0587583Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0587867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0587883Z 2025-12-04T10:49:11.0587966Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0588027Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0588093Z ================== 1 failed, 21 deselected, 2 rerun in 4.01s =================== 2025-12-04T10:49:11.0588130Z Got exit code 1 2025-12-04T10:49:11.0588169Z Retrying single test... 2025-12-04T10:49:11.0588366Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-89324ffeee9e9f85.xml 2025-12-04T10:49:11.0588422Z ============================= test session starts ============================== 2025-12-04T10:49:11.0588533Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0588574Z cachedir: .pytest_cache 2025-12-04T10:49:11.0588730Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0588776Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0588815Z configfile: pytest.ini 2025-12-04T10:49:11.0588981Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0589054Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0589340Z stepcurrent: skipping 21 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0589383Z Running 1 items in this shard 2025-12-04T10:49:11.0589385Z 2025-12-04T10:49:11.0589744Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:19:52.886995689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0589747Z 2025-12-04T10:49:11.0589900Z [W1204 10:19:59.509343531 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0589923Z 2025-12-04T10:49:11.0590072Z [W1204 10:19:59.509529627 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590074Z 2025-12-04T10:49:11.0590221Z [W1204 10:19:59.513471093 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590224Z 2025-12-04T10:49:11.0590371Z [W1204 10:19:59.513766047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590374Z 2025-12-04T10:49:11.0590521Z [W1204 10:19:59.513841176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590523Z 2025-12-04T10:49:11.0590668Z [W1204 10:19:59.516304509 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590671Z 2025-12-04T10:49:11.0590818Z [W1204 10:19:59.516562524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0590820Z 2025-12-04T10:49:11.0590967Z [W1204 10:19:59.516637293 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0590979Z 2025-12-04T10:49:11.0591029Z ('RERUN', {'yellow': True}) [10.6690s] [100%] 2025-12-04T10:49:11.0591382Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:20:00.252400675 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0591398Z 2025-12-04T10:49:11.0591545Z [W1204 10:20:00.252776968 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0591548Z 2025-12-04T10:49:11.0591696Z [W1204 10:20:00.252864907 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0591698Z 2025-12-04T10:49:11.0591882Z [W1204 10:20:00.254239091 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0591886Z 2025-12-04T10:49:11.0592032Z [W1204 10:20:00.254493836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0592036Z 2025-12-04T10:49:11.0592183Z [W1204 10:20:00.254569634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0592185Z 2025-12-04T10:49:11.0592330Z [W1204 10:20:00.256754803 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0592332Z 2025-12-04T10:49:11.0592483Z [W1204 10:20:00.257019708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0592485Z 2025-12-04T10:49:11.0592632Z [W1204 10:20:00.257098287 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0592634Z 2025-12-04T10:49:11.0592683Z ('RERUN', {'yellow': True}) [0.6263s] [100%] 2025-12-04T10:49:11.0593037Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:20:01.923793455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593040Z 2025-12-04T10:49:11.0593187Z [W1204 10:20:01.924180628 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593189Z 2025-12-04T10:49:11.0593375Z [W1204 10:20:01.924266807 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593377Z 2025-12-04T10:49:11.0593524Z [W1204 10:20:01.925646500 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593528Z 2025-12-04T10:49:11.0593674Z [W1204 10:20:01.925896566 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593677Z 2025-12-04T10:49:11.0593823Z [W1204 10:20:01.925971914 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0593825Z 2025-12-04T10:49:11.0593971Z [W1204 10:20:01.928157433 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0593973Z 2025-12-04T10:49:11.0594120Z [W1204 10:20:01.928414538 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0594122Z 2025-12-04T10:49:11.0594268Z [W1204 10:20:01.928491047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0594300Z 2025-12-04T10:49:11.0594339Z FAILED [0.6428s] [100%] 2025-12-04T10:49:11.0594341Z 2025-12-04T10:49:11.0594393Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0594556Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0594601Z Traceback (most recent call last): 2025-12-04T10:49:11.0594756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0594797Z method(*args, **kwargs) 2025-12-04T10:49:11.0594949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0594989Z method(*args, **kwargs) 2025-12-04T10:49:11.0595138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0595176Z with policy(): 2025-12-04T10:49:11.0595327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0595370Z raise RuntimeError(msg) 2025-12-04T10:49:11.0595762Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0595764Z 2025-12-04T10:49:11.0595838Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0596126Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0596130Z 2025-12-04T10:49:11.0596216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0596287Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0596343Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0596614Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0596686Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0596743Z graph_break [] 2025-12-04T10:49:11.0596813Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0597157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0597202Z if out == self.unknown_value: 2025-12-04T10:49:11.0597352Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0597398Z Traceback (most recent call last): 2025-12-04T10:49:11.0597551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0597589Z method(*args, **kwargs) 2025-12-04T10:49:11.0597743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0597781Z method(*args, **kwargs) 2025-12-04T10:49:11.0597931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0597968Z with policy(): 2025-12-04T10:49:11.0598133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0598173Z raise RuntimeError(msg) 2025-12-04T10:49:11.0598570Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0598585Z 2025-12-04T10:49:11.0598657Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0598943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0598945Z 2025-12-04T10:49:11.0599031Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0599102Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0599157Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0599427Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0599498Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0599535Z graph_break [] 2025-12-04T10:49:11.0599606Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0599949Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0599993Z if out == self.unknown_value: 2025-12-04T10:49:11.0600063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0600117Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0600189Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0600459Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0600495Z graph_break [] 2025-12-04T10:49:11.0600570Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0600719Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0600763Z Traceback (most recent call last): 2025-12-04T10:49:11.0600918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0600956Z method(*args, **kwargs) 2025-12-04T10:49:11.0601108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0601146Z method(*args, **kwargs) 2025-12-04T10:49:11.0601296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0601331Z with policy(): 2025-12-04T10:49:11.0601484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0601524Z raise RuntimeError(msg) 2025-12-04T10:49:11.0601978Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0601993Z 2025-12-04T10:49:11.0602067Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0602370Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0602372Z 2025-12-04T10:49:11.0602458Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0602529Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0602585Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0602854Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0602926Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0602963Z graph_break [] 2025-12-04T10:49:11.0603033Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0603377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0603421Z if out == self.unknown_value: 2025-12-04T10:49:11.0603585Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0603639Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0603709Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0603976Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0604015Z graph_break [] 2025-12-04T10:49:11.0604084Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0604137Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0604206Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0604512Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0604548Z graph_break [] 2025-12-04T10:49:11.0604789Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-89324ffeee9e9f85.xml - 2025-12-04T10:49:11.0604848Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0605472Z FAILED [0.6428s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0605474Z 2025-12-04T10:49:11.0605546Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0605828Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0605841Z 2025-12-04T10:49:11.0605925Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0606001Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0606066Z ================== 1 failed, 57 deselected, 2 rerun in 12.09s ================== 2025-12-04T10:49:11.0606102Z Got exit code 1 2025-12-04T10:49:11.0606141Z Retrying single test... 2025-12-04T10:49:11.0606339Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dfff5407af113acb.xml 2025-12-04T10:49:11.0606397Z ============================= test session starts ============================== 2025-12-04T10:49:11.0606509Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0606549Z cachedir: .pytest_cache 2025-12-04T10:49:11.0606708Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0606752Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0606796Z configfile: pytest.ini 2025-12-04T10:49:11.0606958Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0607032Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0607315Z stepcurrent: skipping 21 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0607360Z Running 1 items in this shard 2025-12-04T10:49:11.0607362Z 2025-12-04T10:49:11.0607717Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:20:11.880921207 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0607720Z 2025-12-04T10:49:11.0607873Z [W1204 10:20:19.548852700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0607875Z 2025-12-04T10:49:11.0608024Z [W1204 10:20:19.548996697 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608026Z 2025-12-04T10:49:11.0608201Z [W1204 10:20:19.552764606 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608203Z 2025-12-04T10:49:11.0608350Z [W1204 10:20:19.553063101 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608353Z 2025-12-04T10:49:11.0608499Z [W1204 10:20:19.553143799 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608501Z 2025-12-04T10:49:11.0608651Z [W1204 10:20:19.555722501 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0608654Z 2025-12-04T10:49:11.0608802Z [W1204 10:20:19.555982716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608804Z 2025-12-04T10:49:11.0608951Z [W1204 10:20:19.556063894 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0608953Z 2025-12-04T10:49:11.0609004Z ('RERUN', {'yellow': True}) [10.6231s] [100%] 2025-12-04T10:49:11.0609356Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:20:19.214765193 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0609369Z 2025-12-04T10:49:11.0609529Z [W1204 10:20:19.215188015 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0609530Z 2025-12-04T10:49:11.0609676Z [W1204 10:20:19.215301022 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0609679Z 2025-12-04T10:49:11.0609825Z [W1204 10:20:19.216746075 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0609827Z 2025-12-04T10:49:11.0609974Z [W1204 10:20:19.217038300 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0609976Z 2025-12-04T10:49:11.0610123Z [W1204 10:20:19.217126938 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0610124Z 2025-12-04T10:49:11.0610271Z [W1204 10:20:19.219450744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0610274Z 2025-12-04T10:49:11.0610421Z [W1204 10:20:19.219725429 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0610424Z 2025-12-04T10:49:11.0610570Z [W1204 10:20:19.219805887 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0610572Z 2025-12-04T10:49:11.0610621Z ('RERUN', {'yellow': True}) [0.5277s] [100%] 2025-12-04T10:49:11.0610977Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:20:20.733450694 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0610980Z 2025-12-04T10:49:11.0611128Z [W1204 10:20:20.733856616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0611131Z 2025-12-04T10:49:11.0611277Z [W1204 10:20:20.733956684 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0611279Z 2025-12-04T10:49:11.0611450Z [W1204 10:20:20.735364838 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0611452Z 2025-12-04T10:49:11.0611599Z [W1204 10:20:20.735638992 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0611601Z 2025-12-04T10:49:11.0611748Z [W1204 10:20:20.735719981 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0611751Z 2025-12-04T10:49:11.0611940Z [W1204 10:20:20.737995248 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0611943Z 2025-12-04T10:49:11.0612090Z [W1204 10:20:20.738266633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0612092Z 2025-12-04T10:49:11.0612240Z [W1204 10:20:20.738344501 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0612242Z 2025-12-04T10:49:11.0612282Z FAILED [0.5115s] [100%] 2025-12-04T10:49:11.0612284Z 2025-12-04T10:49:11.0612335Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0612486Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0612556Z Traceback (most recent call last): 2025-12-04T10:49:11.0612711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0612768Z method(*args, **kwargs) 2025-12-04T10:49:11.0612920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0612959Z method(*args, **kwargs) 2025-12-04T10:49:11.0613111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0613150Z with policy(): 2025-12-04T10:49:11.0613302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0613342Z raise RuntimeError(msg) 2025-12-04T10:49:11.0613732Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0613737Z 2025-12-04T10:49:11.0613810Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0614097Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0614101Z 2025-12-04T10:49:11.0614187Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0614258Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0614315Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0614589Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0614661Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0614697Z graph_break [] 2025-12-04T10:49:11.0614767Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0615137Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0615182Z if out == self.unknown_value: 2025-12-04T10:49:11.0615329Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0615376Z Traceback (most recent call last): 2025-12-04T10:49:11.0615526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0615568Z method(*args, **kwargs) 2025-12-04T10:49:11.0615718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0615756Z method(*args, **kwargs) 2025-12-04T10:49:11.0615906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0615942Z with policy(): 2025-12-04T10:49:11.0616095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0616136Z raise RuntimeError(msg) 2025-12-04T10:49:11.0616538Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0616570Z 2025-12-04T10:49:11.0616641Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0616925Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0616927Z 2025-12-04T10:49:11.0617014Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0617085Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0617139Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0617411Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0617483Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0617519Z graph_break [] 2025-12-04T10:49:11.0617590Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0617932Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0617976Z if out == self.unknown_value: 2025-12-04T10:49:11.0618045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0618100Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0618169Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0618440Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0618476Z graph_break [] 2025-12-04T10:49:11.0618528Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0618678Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0618723Z Traceback (most recent call last): 2025-12-04T10:49:11.0618897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0618938Z method(*args, **kwargs) 2025-12-04T10:49:11.0619086Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0619127Z method(*args, **kwargs) 2025-12-04T10:49:11.0619277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0619314Z with policy(): 2025-12-04T10:49:11.0619467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0619506Z raise RuntimeError(msg) 2025-12-04T10:49:11.0619907Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0619909Z 2025-12-04T10:49:11.0619980Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0620276Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0620288Z 2025-12-04T10:49:11.0620373Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0620445Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0620500Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0620771Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0620842Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0620877Z graph_break [] 2025-12-04T10:49:11.0620947Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0621286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0621331Z if out == self.unknown_value: 2025-12-04T10:49:11.0621400Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0621455Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0621524Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0621796Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0621832Z graph_break [] 2025-12-04T10:49:11.0621943Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0621995Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0622064Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0622333Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0622370Z graph_break [] 2025-12-04T10:49:11.0622648Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dfff5407af113acb.xml - 2025-12-04T10:49:11.0622709Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0623330Z FAILED [0.5115s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0623335Z 2025-12-04T10:49:11.0623406Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0623691Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0623693Z 2025-12-04T10:49:11.0623776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0623841Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0623922Z ================== 1 failed, 57 deselected, 2 rerun in 11.80s ================== 2025-12-04T10:49:11.0623958Z Got exit code 1 2025-12-04T10:49:11.0624208Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0624333Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0624531Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6f3b76887c577c47.xml 2025-12-04T10:49:11.0624587Z ============================= test session starts ============================== 2025-12-04T10:49:11.0624699Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0624739Z cachedir: .pytest_cache 2025-12-04T10:49:11.0624898Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0624943Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0624985Z configfile: pytest.ini 2025-12-04T10:49:11.0625146Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0625219Z collecting ... collected 58 items / 22 deselected / 36 selected 2025-12-04T10:49:11.0625271Z stepcurrent: skipping 22 already run items. 2025-12-04T10:49:11.0625314Z Running 36 items in this shard 2025-12-04T10:49:11.0625317Z 2025-12-04T10:49:11.0625566Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7074s] [ 2%] 2025-12-04T10:49:11.0625808Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7710s] [ 2%] 2025-12-04T10:49:11.0626030Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.7368s] [ 2%] 2025-12-04T10:49:11.0626033Z 2025-12-04T10:49:11.0626082Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0626232Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0626296Z Traceback (most recent call last): 2025-12-04T10:49:11.0626453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0626492Z method(*args, **kwargs) 2025-12-04T10:49:11.0626645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0626684Z method(*args, **kwargs) 2025-12-04T10:49:11.0626835Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0626871Z with policy(): 2025-12-04T10:49:11.0627027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0627066Z raise RuntimeError(msg) 2025-12-04T10:49:11.0627466Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0627469Z 2025-12-04T10:49:11.0627540Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0627838Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0627851Z 2025-12-04T10:49:11.0627937Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0628007Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0628062Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0628238Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0628310Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0628345Z graph_break [] 2025-12-04T10:49:11.0628495Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0628540Z Traceback (most recent call last): 
2025-12-04T10:49:11.0628693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0628733Z method(*args, **kwargs) 2025-12-04T10:49:11.0628883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0628921Z method(*args, **kwargs) 2025-12-04T10:49:11.0629070Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0629108Z with policy(): 2025-12-04T10:49:11.0629259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0629300Z raise RuntimeError(msg) 2025-12-04T10:49:11.0629706Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0629710Z 2025-12-04T10:49:11.0629782Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0630068Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0630092Z 2025-12-04T10:49:11.0630177Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0630248Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0630302Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0630478Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0630548Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0630584Z graph_break [] 2025-12-04T10:49:11.0630655Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0630709Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0630778Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0630953Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0630988Z graph_break [] 2025-12-04T10:49:11.0631041Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0631189Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0631246Z Traceback (most recent call last): 2025-12-04T10:49:11.0631396Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0631447Z method(*args, **kwargs) 2025-12-04T10:49:11.0631597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0631636Z method(*args, **kwargs) 2025-12-04T10:49:11.0631784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0631821Z with policy(): 2025-12-04T10:49:11.0632009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0632049Z raise RuntimeError(msg) 2025-12-04T10:49:11.0632456Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0632459Z 2025-12-04T10:49:11.0632531Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0632821Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0632824Z 2025-12-04T10:49:11.0632907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0632978Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0633031Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0633206Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0633277Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0633313Z graph_break [] 2025-12-04T10:49:11.0633383Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0633436Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0633504Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0633708Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0633744Z graph_break [] 2025-12-04T10:49:11.0633816Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0633869Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0633940Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0634111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0634149Z graph_break [] 2025-12-04T10:49:11.0634387Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6f3b76887c577c47.xml - 2025-12-04T10:49:11.0634449Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0635091Z FAILED [0.7368s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0635155Z 2025-12-04T10:49:11.0635226Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0635510Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0635512Z 2025-12-04T10:49:11.0635596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0635658Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0635723Z ================== 1 failed, 22 deselected, 2 rerun in 4.35s =================== 2025-12-04T10:49:11.0635760Z Got exit code 1 2025-12-04T10:49:11.0635799Z Retrying single test... 
2025-12-04T10:49:11.0635993Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21a5e859f2763dc6.xml 2025-12-04T10:49:11.0636051Z ============================= test session starts ============================== 2025-12-04T10:49:11.0636161Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0636202Z cachedir: .pytest_cache 2025-12-04T10:49:11.0636359Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0636405Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0636444Z configfile: pytest.ini 2025-12-04T10:49:11.0636605Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0636678Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0636964Z stepcurrent: skipping 22 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0637007Z Running 1 items in this shard 2025-12-04T10:49:11.0637010Z 2025-12-04T10:49:11.0637394Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:20:40.047941843 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0637397Z 2025-12-04T10:49:11.0637548Z [W1204 10:20:47.660122455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0637550Z 2025-12-04T10:49:11.0637698Z [W1204 10:20:47.660276192 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0637701Z 2025-12-04T10:49:11.0637848Z [W1204 10:20:47.663810875 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0637851Z 2025-12-04T10:49:11.0637996Z [W1204 10:20:47.664185818 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0637997Z 2025-12-04T10:49:11.0638144Z [W1204 10:20:47.664272327 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0638147Z 2025-12-04T10:49:11.0638294Z [W1204 10:20:47.666857098 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0638295Z 2025-12-04T10:49:11.0638440Z [W1204 10:20:47.667149872 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0638454Z 2025-12-04T10:49:11.0638601Z [W1204 10:20:47.667230821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0638615Z 2025-12-04T10:49:11.0638663Z ('RERUN', {'yellow': True}) [9.3724s] [100%] 2025-12-04T10:49:11.0639019Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:20:48.903519409 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0639021Z 2025-12-04T10:49:11.0639168Z [W1204 10:20:48.903873702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639171Z 2025-12-04T10:49:11.0639317Z [W1204 10:20:48.903954381 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639320Z 2025-12-04T10:49:11.0639467Z [W1204 10:20:48.905315675 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639470Z 2025-12-04T10:49:11.0639616Z [W1204 10:20:48.905635379 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639618Z 2025-12-04T10:49:11.0639764Z [W1204 10:20:48.905717628 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639767Z 2025-12-04T10:49:11.0639914Z [W1204 10:20:48.907918526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0639917Z 2025-12-04T10:49:11.0640063Z [W1204 10:20:48.908179861 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0640066Z 2025-12-04T10:49:11.0640213Z [W1204 10:20:48.908258520 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0640216Z 2025-12-04T10:49:11.0640264Z ('RERUN', {'yellow': True}) [0.7234s] [100%] 2025-12-04T10:49:11.0640617Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:20:49.649725407 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0640644Z 2025-12-04T10:49:11.0640791Z [W1204 10:20:49.650113220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0640793Z 2025-12-04T10:49:11.0640939Z [W1204 10:20:49.650199539 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0640942Z 2025-12-04T10:49:11.0641089Z [W1204 10:20:49.651563853 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0641092Z 2025-12-04T10:49:11.0641239Z [W1204 10:20:49.651880817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0641241Z 2025-12-04T10:49:11.0641388Z [W1204 10:20:49.651958205 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0641390Z 2025-12-04T10:49:11.0641536Z [W1204 10:20:49.654132644 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0641538Z 2025-12-04T10:49:11.0641686Z [W1204 10:20:49.654386210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0641698Z 2025-12-04T10:49:11.0641884Z [W1204 10:20:49.654461538 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0641903Z 2025-12-04T10:49:11.0641941Z FAILED [0.7386s] [100%] 2025-12-04T10:49:11.0641943Z 2025-12-04T10:49:11.0641994Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0642143Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0642188Z Traceback (most recent call last): 2025-12-04T10:49:11.0642345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0642385Z method(*args, **kwargs) 2025-12-04T10:49:11.0642535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0642576Z method(*args, **kwargs) 2025-12-04T10:49:11.0642723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0642762Z with policy(): 2025-12-04T10:49:11.0642913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0642954Z raise RuntimeError(msg) 2025-12-04T10:49:11.0643355Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0643359Z 2025-12-04T10:49:11.0643431Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0643721Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0643725Z 2025-12-04T10:49:11.0643810Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0643882Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0643937Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0644139Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0644210Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0644247Z graph_break [] 2025-12-04T10:49:11.0644316Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0644662Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0644706Z if out == self.unknown_value: 2025-12-04T10:49:11.0644856Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0644899Z Traceback (most recent call last): 2025-12-04T10:49:11.0645051Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0645092Z method(*args, **kwargs) 2025-12-04T10:49:11.0645242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0645281Z method(*args, **kwargs) 2025-12-04T10:49:11.0645431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0645481Z with policy(): 2025-12-04T10:49:11.0645631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0645685Z raise RuntimeError(msg) 2025-12-04T10:49:11.0646093Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0646095Z 2025-12-04T10:49:11.0646168Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0646453Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0646456Z 2025-12-04T10:49:11.0646542Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0646613Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0646668Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0646841Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0646912Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0646950Z graph_break [] 2025-12-04T10:49:11.0647020Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0647364Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0647408Z if out == self.unknown_value: 2025-12-04T10:49:11.0647479Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0647533Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0647603Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0647777Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0647833Z graph_break [] 2025-12-04T10:49:11.0647885Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0648034Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0648078Z Traceback (most recent call last): 2025-12-04T10:49:11.0648231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0648269Z method(*args, **kwargs) 2025-12-04T10:49:11.0648421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0648459Z method(*args, **kwargs) 2025-12-04T10:49:11.0648608Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0648644Z with policy(): 2025-12-04T10:49:11.0648797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0648837Z raise RuntimeError(msg) 2025-12-04T10:49:11.0649243Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0649266Z 2025-12-04T10:49:11.0649351Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0649635Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0649637Z 2025-12-04T10:49:11.0649725Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0649795Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0649851Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0650023Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0650095Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0650131Z graph_break [] 2025-12-04T10:49:11.0650204Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0650545Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0650587Z if out == self.unknown_value: 2025-12-04T10:49:11.0650658Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0650711Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0650782Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0650955Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0650992Z graph_break [] 2025-12-04T10:49:11.0651062Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0651116Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0651185Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0651357Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0651412Z graph_break [] 2025-12-04T10:49:11.0651652Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21a5e859f2763dc6.xml - 2025-12-04T10:49:11.0651711Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0654139Z FAILED [0.7386s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0654145Z 2025-12-04T10:49:11.0654225Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0654519Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0654521Z 2025-12-04T10:49:11.0654607Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0654699Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0654766Z ================== 1 failed, 57 deselected, 2 rerun in 10.96s ================== 2025-12-04T10:49:11.0654822Z Got exit code 1 2025-12-04T10:49:11.0654863Z Retrying single test... 2025-12-04T10:49:11.0655059Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-801c56af220acf2c.xml 2025-12-04T10:49:11.0655117Z ============================= test session starts ============================== 2025-12-04T10:49:11.0655232Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0655273Z cachedir: .pytest_cache 2025-12-04T10:49:11.0655431Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0655482Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0655522Z configfile: pytest.ini 2025-12-04T10:49:11.0655688Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0655764Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0656051Z stepcurrent: skipping 22 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0656095Z Running 1 items in this shard 2025-12-04T10:49:11.0656097Z 2025-12-04T10:49:11.0656462Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:20:58.867990096 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0656467Z 2025-12-04T10:49:11.0656619Z [W1204 10:21:05.099100892 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0656624Z 2025-12-04T10:49:11.0656773Z [W1204 10:21:05.099270018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0656775Z 2025-12-04T10:49:11.0656924Z [W1204 10:21:05.102788492 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0656926Z 2025-12-04T10:49:11.0657100Z [W1204 10:21:05.103083177 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0657102Z 2025-12-04T10:49:11.0657250Z [W1204 10:21:05.103165555 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0657255Z 2025-12-04T10:49:11.0657402Z [W1204 10:21:05.105538990 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0657404Z 2025-12-04T10:49:11.0657552Z [W1204 10:21:05.105798726 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0657554Z 2025-12-04T10:49:11.0657702Z [W1204 10:21:05.105874474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0657704Z 2025-12-04T10:49:11.0657754Z ('RERUN', {'yellow': True}) [9.9566s] [100%] 2025-12-04T10:49:11.0658113Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:21:06.292177474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658127Z 2025-12-04T10:49:11.0658276Z [W1204 10:21:06.292547637 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658278Z 2025-12-04T10:49:11.0658438Z [W1204 10:21:06.292628015 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658440Z 2025-12-04T10:49:11.0658586Z [W1204 10:21:06.293999760 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658587Z 2025-12-04T10:49:11.0658737Z [W1204 10:21:06.294327393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658739Z 2025-12-04T10:49:11.0658886Z [W1204 10:21:06.294404942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0658889Z 2025-12-04T10:49:11.0659034Z [W1204 10:21:06.296613350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0659038Z 2025-12-04T10:49:11.0659184Z [W1204 10:21:06.296869936 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0659186Z 2025-12-04T10:49:11.0659332Z [W1204 10:21:06.296943884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0659334Z 2025-12-04T10:49:11.0659381Z ('RERUN', {'yellow': True}) [0.6899s] [100%] 2025-12-04T10:49:11.0659738Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:21:07.976818679 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0659741Z 2025-12-04T10:49:11.0659888Z [W1204 10:21:07.977178772 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0659890Z 2025-12-04T10:49:11.0660037Z [W1204 10:21:07.977260451 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660039Z 2025-12-04T10:49:11.0660187Z [W1204 10:21:07.978621305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660189Z 2025-12-04T10:49:11.0660358Z [W1204 10:21:07.978935939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660360Z 2025-12-04T10:49:11.0660507Z [W1204 10:21:07.979017058 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0660509Z 2025-12-04T10:49:11.0660655Z [W1204 10:21:07.981288645 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660657Z 2025-12-04T10:49:11.0660803Z [W1204 10:21:07.981545620 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660806Z 2025-12-04T10:49:11.0660953Z [W1204 10:21:07.981620039 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0660954Z 2025-12-04T10:49:11.0660992Z FAILED [0.6816s] [100%] 2025-12-04T10:49:11.0660994Z 2025-12-04T10:49:11.0661048Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0661196Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0661242Z Traceback (most recent call last): 2025-12-04T10:49:11.0661398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0661450Z method(*args, **kwargs) 2025-12-04T10:49:11.0661601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0661661Z method(*args, **kwargs) 2025-12-04T10:49:11.0661810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0661894Z with policy(): 2025-12-04T10:49:11.0662048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0662088Z raise RuntimeError(msg) 2025-12-04T10:49:11.0662496Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0662500Z 2025-12-04T10:49:11.0662573Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0662866Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0662868Z 2025-12-04T10:49:11.0662953Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0663027Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0663083Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0663259Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0663331Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0663368Z graph_break [] 2025-12-04T10:49:11.0663440Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0663788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0663831Z if out == self.unknown_value: 2025-12-04T10:49:11.0664018Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0664063Z Traceback (most recent call last): 2025-12-04T10:49:11.0664215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0664255Z method(*args, **kwargs) 2025-12-04T10:49:11.0664404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0664444Z method(*args, **kwargs) 2025-12-04T10:49:11.0664593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0664630Z with policy(): 2025-12-04T10:49:11.0664779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0664820Z raise RuntimeError(msg) 2025-12-04T10:49:11.0665226Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0665242Z 2025-12-04T10:49:11.0665316Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0665605Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0665623Z 2025-12-04T10:49:11.0665708Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0665780Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0665836Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0666011Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0666082Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0666119Z graph_break [] 2025-12-04T10:49:11.0666189Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0666529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0666572Z if out == self.unknown_value: 2025-12-04T10:49:11.0666644Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0666697Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0666770Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0666944Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0666980Z graph_break [] 2025-12-04T10:49:11.0667032Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0667182Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0667227Z Traceback (most recent call last): 2025-12-04T10:49:11.0667379Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0667418Z method(*args, **kwargs) 2025-12-04T10:49:11.0667567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0667627Z method(*args, **kwargs) 2025-12-04T10:49:11.0667776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0667813Z with policy(): 2025-12-04T10:49:11.0667963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0668005Z raise RuntimeError(msg) 2025-12-04T10:49:11.0668410Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0668413Z 2025-12-04T10:49:11.0668486Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0668772Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0668774Z 2025-12-04T10:49:11.0668861Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0668942Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0668997Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0669170Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0669255Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0669291Z graph_break [] 2025-12-04T10:49:11.0669360Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0669703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0669745Z if out == self.unknown_value: 2025-12-04T10:49:11.0669815Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0669868Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0669939Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0670111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0670148Z graph_break [] 2025-12-04T10:49:11.0670217Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0670270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0670342Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0670515Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0670550Z graph_break [] 2025-12-04T10:49:11.0670792Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-801c56af220acf2c.xml - 2025-12-04T10:49:11.0670851Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0671506Z FAILED [0.6816s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0671509Z 2025-12-04T10:49:11.0671581Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0671926Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0671930Z 2025-12-04T10:49:11.0672015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0672075Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0672142Z ================== 1 failed, 57 deselected, 2 rerun in 11.46s ================== 2025-12-04T10:49:11.0672178Z Got exit code 1 2025-12-04T10:49:11.0672418Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0672545Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0672741Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5f7b7b2f4f7c599c.xml 2025-12-04T10:49:11.0672816Z ============================= test session starts ============================== 2025-12-04T10:49:11.0672943Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0672984Z cachedir: .pytest_cache 2025-12-04T10:49:11.0673141Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0673188Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0673227Z configfile: pytest.ini 2025-12-04T10:49:11.0673391Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0673464Z collecting ... collected 58 items / 23 deselected / 35 selected 2025-12-04T10:49:11.0673516Z stepcurrent: skipping 23 already run items. 2025-12-04T10:49:11.0673559Z Running 35 items in this shard 2025-12-04T10:49:11.0673561Z 2025-12-04T10:49:11.0673806Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5856s] [ 2%] 2025-12-04T10:49:11.0674047Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6766s] [ 2%] 2025-12-04T10:49:11.0674267Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 FAILED [0.6495s] [ 2%] 2025-12-04T10:49:11.0674269Z 2025-12-04T10:49:11.0674321Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0674468Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0674514Z Traceback (most recent call last): 2025-12-04T10:49:11.0674670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0674711Z method(*args, **kwargs) 2025-12-04T10:49:11.0674861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0674900Z method(*args, **kwargs) 2025-12-04T10:49:11.0675049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0675111Z with policy(): 2025-12-04T10:49:11.0675262Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0675303Z raise RuntimeError(msg) 2025-12-04T10:49:11.0675696Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0675700Z 2025-12-04T10:49:11.0675772Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0676057Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0676060Z 2025-12-04T10:49:11.0676146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0676218Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0676272Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0676462Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0676533Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0676584Z graph_break [] 2025-12-04T10:49:11.0676731Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0676776Z Traceback (most recent call last): 2025-12-04T10:49:11.0676927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0676968Z method(*args, **kwargs) 
2025-12-04T10:49:11.0677116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0677156Z method(*args, **kwargs) 2025-12-04T10:49:11.0677305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0677343Z with policy(): 2025-12-04T10:49:11.0677493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0677534Z raise RuntimeError(msg) 2025-12-04T10:49:11.0677933Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0677936Z 2025-12-04T10:49:11.0678007Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0678293Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0678296Z 2025-12-04T10:49:11.0678380Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0678452Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0678505Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0678680Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0678750Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.0678808Z graph_break [] 2025-12-04T10:49:11.0678879Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0678933Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0679001Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0679176Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0679213Z graph_break [] 2025-12-04T10:49:11.0679266Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0679415Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0679459Z Traceback (most recent call last): 2025-12-04T10:49:11.0679613Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0679652Z method(*args, **kwargs) 2025-12-04T10:49:11.0679801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0679838Z method(*args, **kwargs) 2025-12-04T10:49:11.0679998Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0680034Z with policy(): 2025-12-04T10:49:11.0680185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0680236Z raise RuntimeError(msg) 2025-12-04T10:49:11.0680638Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0680640Z 2025-12-04T10:49:11.0680712Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0680996Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0680999Z 2025-12-04T10:49:11.0681083Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0681154Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0681208Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0681381Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0681453Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0681489Z graph_break [] 2025-12-04T10:49:11.0681559Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0681613Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0681683Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0681902Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0681939Z graph_break [] 2025-12-04T10:49:11.0682008Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0682061Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0682130Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0682343Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0682379Z graph_break [] 2025-12-04T10:49:11.0682623Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5f7b7b2f4f7c599c.xml - 2025-12-04T10:49:11.0682683Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0683309Z FAILED [0.6495s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0683313Z 2025-12-04T10:49:11.0683387Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0683672Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0683695Z 2025-12-04T10:49:11.0683780Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0683841Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0683923Z ================== 1 failed, 23 deselected, 2 rerun in 4.05s =================== 2025-12-04T10:49:11.0683958Z Got exit code 1 2025-12-04T10:49:11.0683998Z Retrying single test... 
2025-12-04T10:49:11.0684192Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ece7fe2250e237c7.xml 2025-12-04T10:49:11.0684250Z ============================= test session starts ============================== 2025-12-04T10:49:11.0684361Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0684402Z cachedir: .pytest_cache 2025-12-04T10:49:11.0684559Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0684605Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0684645Z configfile: pytest.ini 2025-12-04T10:49:11.0684805Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0684879Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0685160Z stepcurrent: skipping 23 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0685206Z Running 1 items in this shard 2025-12-04T10:49:11.0685208Z 2025-12-04T10:49:11.0685564Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:27.912281322 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0685567Z 2025-12-04T10:49:11.0685720Z [W1204 10:21:34.482925075 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0685723Z 2025-12-04T10:49:11.0685874Z [W1204 10:21:34.483092782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0685876Z 2025-12-04T10:49:11.0686023Z [W1204 10:21:34.486417709 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686045Z 2025-12-04T10:49:11.0686193Z [W1204 10:21:34.486721114 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686195Z 2025-12-04T10:49:11.0686341Z [W1204 10:21:34.486801482 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686344Z 2025-12-04T10:49:11.0686491Z [W1204 10:21:34.489232636 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686493Z 2025-12-04T10:49:11.0686638Z [W1204 10:21:34.489501971 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686641Z 2025-12-04T10:49:11.0686787Z [W1204 10:21:34.489579180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0686789Z 2025-12-04T10:49:11.0686842Z ('RERUN', {'yellow': True}) [10.2538s] [100%] 2025-12-04T10:49:11.0687195Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:36.618790797 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0687208Z 2025-12-04T10:49:11.0687355Z [W1204 10:21:36.619175410 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0687370Z 2025-12-04T10:49:11.0687517Z [W1204 10:21:36.619263038 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0687520Z 2025-12-04T10:49:11.0687776Z [W1204 10:21:36.620645502 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0687778Z 2025-12-04T10:49:11.0687927Z [W1204 10:21:36.620969716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0687929Z 2025-12-04T10:49:11.0688076Z [W1204 10:21:36.621052845 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0688079Z 2025-12-04T10:49:11.0688227Z [W1204 10:21:36.623273093 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0688230Z 2025-12-04T10:49:11.0688375Z [W1204 10:21:36.623532288 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0688377Z 2025-12-04T10:49:11.0688523Z [W1204 10:21:36.623607997 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0688525Z 2025-12-04T10:49:11.0688576Z ('RERUN', {'yellow': True}) [0.6542s] [100%] 2025-12-04T10:49:11.0688927Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:36.268871597 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0688930Z 2025-12-04T10:49:11.0689079Z [W1204 10:21:36.269265840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689083Z 2025-12-04T10:49:11.0689229Z [W1204 10:21:36.269349268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689231Z 2025-12-04T10:49:11.0689377Z [W1204 10:21:36.270736392 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689379Z 2025-12-04T10:49:11.0689555Z [W1204 10:21:36.271069836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689557Z 2025-12-04T10:49:11.0689704Z [W1204 10:21:36.271157915 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689707Z 2025-12-04T10:49:11.0689853Z [W1204 10:21:36.273355143 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0689856Z 2025-12-04T10:49:11.0690002Z [W1204 10:21:36.273614708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0690004Z 2025-12-04T10:49:11.0690152Z [W1204 10:21:36.273690537 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0690154Z 2025-12-04T10:49:11.0690191Z FAILED [0.6195s] [100%] 2025-12-04T10:49:11.0690196Z 2025-12-04T10:49:11.0690247Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0690396Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0690451Z Traceback (most recent call last): 2025-12-04T10:49:11.0690607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0690647Z method(*args, **kwargs) 2025-12-04T10:49:11.0690812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0690851Z method(*args, **kwargs) 2025-12-04T10:49:11.0691001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0691037Z with policy(): 2025-12-04T10:49:11.0691190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0691229Z raise RuntimeError(msg) 2025-12-04T10:49:11.0691622Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0691627Z 2025-12-04T10:49:11.0691698Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0692015Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0692017Z 2025-12-04T10:49:11.0692105Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0692177Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0692233Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0692408Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0692481Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0692517Z graph_break [] 2025-12-04T10:49:11.0692589Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0692934Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0692978Z if out == self.unknown_value: 2025-12-04T10:49:11.0693152Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0693198Z Traceback (most recent call last): 2025-12-04T10:49:11.0693349Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0693390Z method(*args, **kwargs) 2025-12-04T10:49:11.0693541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0693581Z method(*args, **kwargs) 2025-12-04T10:49:11.0693730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0693766Z with policy(): 2025-12-04T10:49:11.0693917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0693959Z raise RuntimeError(msg) 2025-12-04T10:49:11.0694363Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0694378Z 2025-12-04T10:49:11.0694450Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0694734Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0694752Z 2025-12-04T10:49:11.0694837Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0694908Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0694964Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0695138Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0695209Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0695246Z graph_break [] 2025-12-04T10:49:11.0695316Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0695660Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0695703Z if out == self.unknown_value: 2025-12-04T10:49:11.0695773Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0695828Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0695899Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0696074Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0696111Z graph_break [] 2025-12-04T10:49:11.0696162Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0696310Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0696356Z Traceback (most recent call last): 2025-12-04T10:49:11.0696509Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0696548Z method(*args, **kwargs) 2025-12-04T10:49:11.0696721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0696760Z method(*args, **kwargs) 2025-12-04T10:49:11.0696911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0696947Z with policy(): 2025-12-04T10:49:11.0697100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0697141Z raise RuntimeError(msg) 2025-12-04T10:49:11.0697540Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0697543Z 2025-12-04T10:49:11.0697615Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0697900Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0697903Z 2025-12-04T10:49:11.0697987Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0698068Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0698122Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0698308Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0698379Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0698414Z graph_break [] 2025-12-04T10:49:11.0698486Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0698828Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0698871Z if out == self.unknown_value: 2025-12-04T10:49:11.0698942Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0698996Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0699068Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0699242Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0699277Z graph_break [] 2025-12-04T10:49:11.0699348Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0699402Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0699473Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0699643Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0699681Z graph_break [] 2025-12-04T10:49:11.0699921Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ece7fe2250e237c7.xml - 2025-12-04T10:49:11.0699980Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0700634Z FAILED [0.6195s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0700637Z 2025-12-04T10:49:11.0700708Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0700995Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0700998Z 2025-12-04T10:49:11.0701081Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0701142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0701209Z ================== 1 failed, 57 deselected, 2 rerun in 11.67s ================== 2025-12-04T10:49:11.0701245Z Got exit code 1 2025-12-04T10:49:11.0701287Z Retrying single test... 2025-12-04T10:49:11.0701481Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-da6b49e56df80b30.xml 2025-12-04T10:49:11.0701538Z ============================= test session starts ============================== 2025-12-04T10:49:11.0701659Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0701699Z cachedir: .pytest_cache 2025-12-04T10:49:11.0701888Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0701951Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0701991Z configfile: pytest.ini 2025-12-04T10:49:11.0702154Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0702227Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0702509Z stepcurrent: skipping 23 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0702552Z Running 1 items in this shard 2025-12-04T10:49:11.0702555Z 2025-12-04T10:49:11.0702911Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:45.493699377 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0702914Z 2025-12-04T10:49:11.0703066Z [W1204 10:21:53.057904920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703068Z 2025-12-04T10:49:11.0703220Z [W1204 10:21:53.058053137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703221Z 2025-12-04T10:49:11.0703369Z [W1204 10:21:53.061848336 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703371Z 2025-12-04T10:49:11.0703519Z [W1204 10:21:53.062142110 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703521Z 2025-12-04T10:49:11.0703668Z [W1204 10:21:53.062229578 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703671Z 2025-12-04T10:49:11.0703816Z [W1204 10:21:53.064637973 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0703819Z 2025-12-04T10:49:11.0704006Z [W1204 10:21:53.064901638 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0704008Z 2025-12-04T10:49:11.0704157Z [W1204 10:21:53.064984707 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0704158Z 2025-12-04T10:49:11.0704209Z ('RERUN', {'yellow': True}) [10.2307s] [100%] 2025-12-04T10:49:11.0704563Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:54.132095235 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0704566Z 2025-12-04T10:49:11.0704712Z [W1204 10:21:54.132491777 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0704715Z 2025-12-04T10:49:11.0704863Z [W1204 10:21:54.132571236 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0704865Z 2025-12-04T10:49:11.0705011Z [W1204 10:21:54.133941020 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0705013Z 2025-12-04T10:49:11.0705159Z [W1204 10:21:54.134267514 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0705175Z 2025-12-04T10:49:11.0705323Z [W1204 10:21:54.134345912 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0705336Z 2025-12-04T10:49:11.0705482Z [W1204 10:21:54.136575370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0705484Z 2025-12-04T10:49:11.0705632Z [W1204 10:21:54.136836516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0705634Z 2025-12-04T10:49:11.0705782Z [W1204 10:21:54.136911084 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0705783Z 2025-12-04T10:49:11.0705832Z ('RERUN', {'yellow': True}) [0.5859s] [100%] 2025-12-04T10:49:11.0706185Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:21:55.715468270 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0706188Z 2025-12-04T10:49:11.0706335Z [W1204 10:21:55.715855353 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0706336Z 2025-12-04T10:49:11.0706486Z [W1204 10:21:55.715946871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0706488Z 2025-12-04T10:49:11.0706634Z [W1204 10:21:55.717329935 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0706636Z 2025-12-04T10:49:11.0706782Z [W1204 10:21:55.717654549 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0706785Z 2025-12-04T10:49:11.0706931Z [W1204 10:21:55.717732758 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0706934Z 2025-12-04T10:49:11.0707080Z [W1204 10:21:55.719945906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0707082Z 2025-12-04T10:49:11.0707229Z [W1204 10:21:55.720208261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0707251Z 2025-12-04T10:49:11.0707399Z [W1204 10:21:55.720286600 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0707401Z 2025-12-04T10:49:11.0707439Z FAILED [0.5690s] [100%] 2025-12-04T10:49:11.0707441Z 2025-12-04T10:49:11.0707495Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0707644Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0707691Z Traceback (most recent call last): 2025-12-04T10:49:11.0707846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0707886Z method(*args, **kwargs) 2025-12-04T10:49:11.0708036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0708077Z method(*args, **kwargs) 2025-12-04T10:49:11.0708226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0708264Z with policy(): 2025-12-04T10:49:11.0708415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0708467Z raise RuntimeError(msg) 2025-12-04T10:49:11.0708859Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0708872Z 2025-12-04T10:49:11.0708944Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0709232Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0709234Z 2025-12-04T10:49:11.0709318Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0709391Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0709445Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0709621Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0709693Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0709729Z graph_break [] 2025-12-04T10:49:11.0709800Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0710144Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0710187Z if out == self.unknown_value: 2025-12-04T10:49:11.0710336Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0710382Z Traceback (most recent call last): 2025-12-04T10:49:11.0710535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0710574Z method(*args, **kwargs) 2025-12-04T10:49:11.0710725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0710764Z method(*args, **kwargs) 2025-12-04T10:49:11.0710933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0710969Z with policy(): 2025-12-04T10:49:11.0711120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0711161Z raise RuntimeError(msg) 2025-12-04T10:49:11.0711558Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0711563Z 2025-12-04T10:49:11.0711635Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0711953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0711955Z 2025-12-04T10:49:11.0712041Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0712111Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0712167Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0712355Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0712444Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0712481Z graph_break [] 2025-12-04T10:49:11.0712551Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0712895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0712937Z if out == self.unknown_value: 2025-12-04T10:49:11.0713009Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0713062Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0713133Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0713306Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0713344Z graph_break [] 2025-12-04T10:49:11.0713394Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0713543Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0713587Z Traceback (most recent call last): 2025-12-04T10:49:11.0713743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0713781Z method(*args, **kwargs) 2025-12-04T10:49:11.0713934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0713973Z method(*args, **kwargs) 2025-12-04T10:49:11.0714123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0714159Z with policy(): 2025-12-04T10:49:11.0714311Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0714350Z raise RuntimeError(msg) 2025-12-04T10:49:11.0714777Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0714780Z 2025-12-04T10:49:11.0714853Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0715136Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0715139Z 2025-12-04T10:49:11.0715226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0715296Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0715351Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0715525Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0715596Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0715632Z graph_break [] 2025-12-04T10:49:11.0715702Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0716042Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0716126Z if out == self.unknown_value: 2025-12-04T10:49:11.0716196Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0716249Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0716319Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0716494Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0716530Z graph_break [] 2025-12-04T10:49:11.0716599Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0716652Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0716722Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0716894Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0716930Z graph_break [] 2025-12-04T10:49:11.0717171Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-da6b49e56df80b30.xml - 2025-12-04T10:49:11.0717229Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0717861Z FAILED [0.5690s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0717864Z 2025-12-04T10:49:11.0717936Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0718223Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0718225Z 2025-12-04T10:49:11.0718309Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0718387Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0718453Z ================== 1 failed, 57 deselected, 2 rerun in 11.51s ================== 2025-12-04T10:49:11.0718490Z Got exit code 1 2025-12-04T10:49:11.0718725Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0718851Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0719051Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ca5c790e0f2ba3fa.xml 2025-12-04T10:49:11.0719108Z ============================= test session starts ============================== 2025-12-04T10:49:11.0719220Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0719264Z cachedir: .pytest_cache 2025-12-04T10:49:11.0719419Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0719465Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0719504Z configfile: pytest.ini 2025-12-04T10:49:11.0719678Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0719750Z collecting ... collected 58 items / 24 deselected / 34 selected 2025-12-04T10:49:11.0719816Z stepcurrent: skipping 24 already run items. 2025-12-04T10:49:11.0719859Z Running 34 items in this shard 2025-12-04T10:49:11.0719861Z 2025-12-04T10:49:11.0720106Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0180s] [ 2%] 2025-12-04T10:49:11.0720347Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6306s] [ 2%] 2025-12-04T10:49:11.0720568Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.6204s] [ 2%] 2025-12-04T10:49:11.0720572Z 2025-12-04T10:49:11.0720623Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0720771Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0720816Z Traceback (most recent call last): 2025-12-04T10:49:11.0720971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0721011Z method(*args, **kwargs) 2025-12-04T10:49:11.0721163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0721203Z method(*args, **kwargs) 2025-12-04T10:49:11.0721351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0721390Z with policy(): 2025-12-04T10:49:11.0721540Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0721581Z raise RuntimeError(msg) 2025-12-04T10:49:11.0722007Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0722009Z 2025-12-04T10:49:11.0722111Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0722393Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0722396Z 2025-12-04T10:49:11.0722481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0722554Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0722609Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0722882Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0722953Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0722990Z graph_break [] 2025-12-04T10:49:11.0723138Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0723183Z Traceback (most recent call last): 2025-12-04T10:49:11.0723333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.0723393Z method(*args, **kwargs) 2025-12-04T10:49:11.0723541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0723596Z method(*args, **kwargs) 2025-12-04T10:49:11.0723744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0723780Z with policy(): 2025-12-04T10:49:11.0723930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0723972Z raise RuntimeError(msg) 2025-12-04T10:49:11.0724367Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0724371Z 2025-12-04T10:49:11.0724441Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0724725Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0724729Z 2025-12-04T10:49:11.0724812Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0724883Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0724938Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0725209Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0725282Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0725319Z graph_break [] 2025-12-04T10:49:11.0725388Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0725444Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0725513Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0725797Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0725834Z graph_break [] 2025-12-04T10:49:11.0725884Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0726032Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0726077Z Traceback (most recent call last): 2025-12-04T10:49:11.0726229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0726269Z method(*args, **kwargs) 2025-12-04T10:49:11.0726420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0726458Z method(*args, **kwargs) 2025-12-04T10:49:11.0726608Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0726646Z with policy(): 2025-12-04T10:49:11.0726798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0726838Z raise RuntimeError(msg) 2025-12-04T10:49:11.0727240Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0727270Z 2025-12-04T10:49:11.0727341Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0727625Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0727626Z 2025-12-04T10:49:11.0727712Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0727782Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0727836Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0728105Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0728177Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0728213Z graph_break [] 2025-12-04T10:49:11.0728283Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0728335Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0728404Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0728671Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0728708Z graph_break [] 2025-12-04T10:49:11.0728777Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0728832Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0728901Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0729171Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0729207Z graph_break [] 2025-12-04T10:49:11.0729472Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ca5c790e0f2ba3fa.xml - 2025-12-04T10:49:11.0729531Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0730156Z FAILED [0.6204s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0730160Z 2025-12-04T10:49:11.0730232Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0730517Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0730520Z 2025-12-04T10:49:11.0730603Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0730664Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0730748Z ================== 1 failed, 24 deselected, 2 rerun in 4.40s =================== 2025-12-04T10:49:11.0730785Z Got exit code 1 2025-12-04T10:49:11.0730823Z Retrying single test... 2025-12-04T10:49:11.0731037Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-db99f95f0f18e27e.xml 2025-12-04T10:49:11.0731094Z ============================= test session starts ============================== 2025-12-04T10:49:11.0731206Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0731246Z cachedir: .pytest_cache 2025-12-04T10:49:11.0731405Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0731450Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0731490Z configfile: pytest.ini 2025-12-04T10:49:11.0731654Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0731727Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0732046Z stepcurrent: skipping 24 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0732091Z Running 1 items in this shard 2025-12-04T10:49:11.0732094Z 2025-12-04T10:49:11.0732450Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:16.657298134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0732454Z 2025-12-04T10:49:11.0732606Z [W1204 10:22:23.264018838 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0732609Z 2025-12-04T10:49:11.0732759Z [W1204 10:22:23.264169835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0732762Z 2025-12-04T10:49:11.0732908Z [W1204 10:22:23.267921655 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0732910Z 2025-12-04T10:49:11.0733057Z [W1204 10:22:23.268236919 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0733059Z 2025-12-04T10:49:11.0733246Z [W1204 10:22:23.268315668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0733250Z 2025-12-04T10:49:11.0733397Z [W1204 10:22:23.270973058 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0733400Z 2025-12-04T10:49:11.0733547Z [W1204 10:22:23.271246813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0733550Z 2025-12-04T10:49:11.0733695Z [W1204 10:22:23.271323811 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0733697Z 2025-12-04T10:49:11.0733747Z ('RERUN', {'yellow': True}) [10.5822s] [100%] 2025-12-04T10:49:11.0734099Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:24.064792167 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734102Z 2025-12-04T10:49:11.0734250Z [W1204 10:22:24.065211989 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734278Z 2025-12-04T10:49:11.0734425Z [W1204 10:22:24.065309967 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734442Z 2025-12-04T10:49:11.0734587Z [W1204 10:22:24.066767550 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734589Z 2025-12-04T10:49:11.0734736Z [W1204 10:22:24.067032115 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734738Z 2025-12-04T10:49:11.0734885Z [W1204 10:22:24.067110544 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0734887Z 2025-12-04T10:49:11.0735034Z [W1204 10:22:24.069428470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0735037Z 2025-12-04T10:49:11.0735183Z [W1204 10:22:24.069689265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0735185Z 2025-12-04T10:49:11.0735332Z [W1204 10:22:24.069764594 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0735334Z 2025-12-04T10:49:11.0735383Z ('RERUN', {'yellow': True}) [0.6473s] [100%] 2025-12-04T10:49:11.0735737Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:25.702458918 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0735739Z 2025-12-04T10:49:11.0735889Z [W1204 10:22:25.702900890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0735892Z 2025-12-04T10:49:11.0736037Z [W1204 10:22:25.703006798 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736040Z 2025-12-04T10:49:11.0736188Z [W1204 10:22:25.704485450 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736190Z 2025-12-04T10:49:11.0736337Z [W1204 10:22:25.704778765 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736339Z 2025-12-04T10:49:11.0736508Z [W1204 10:22:25.704862563 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736510Z 2025-12-04T10:49:11.0736656Z [W1204 10:22:25.707236749 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0736659Z 2025-12-04T10:49:11.0736804Z [W1204 10:22:25.707509913 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736807Z 2025-12-04T10:49:11.0736952Z [W1204 10:22:25.707588032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0736955Z 2025-12-04T10:49:11.0736994Z FAILED [0.6713s] [100%] 2025-12-04T10:49:11.0736996Z 2025-12-04T10:49:11.0737047Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0737198Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0737243Z Traceback (most recent call last): 2025-12-04T10:49:11.0737399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0737438Z method(*args, **kwargs) 2025-12-04T10:49:11.0737601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0737639Z method(*args, **kwargs) 2025-12-04T10:49:11.0737803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0737839Z with policy(): 2025-12-04T10:49:11.0737992Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0738031Z raise RuntimeError(msg) 2025-12-04T10:49:11.0738428Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0738430Z 2025-12-04T10:49:11.0738504Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0738791Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0738794Z 2025-12-04T10:49:11.0738880Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0738951Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0739007Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0739278Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0739350Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0739387Z graph_break [] 2025-12-04T10:49:11.0739457Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0739800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0739846Z if out == self.unknown_value: 2025-12-04T10:49:11.0739994Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0740060Z Traceback (most recent call last): 2025-12-04T10:49:11.0740214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0740253Z method(*args, **kwargs) 2025-12-04T10:49:11.0740403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0740443Z method(*args, **kwargs) 2025-12-04T10:49:11.0740593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0740630Z with policy(): 2025-12-04T10:49:11.0740781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0740820Z raise RuntimeError(msg) 2025-12-04T10:49:11.0741222Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0741224Z 2025-12-04T10:49:11.0741295Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0741592Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0741607Z 2025-12-04T10:49:11.0741693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0741763Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0741818Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0742121Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0742192Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0742228Z graph_break [] 2025-12-04T10:49:11.0742300Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0742640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0742684Z if out == self.unknown_value: 2025-12-04T10:49:11.0742755Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0742809Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0742883Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0743152Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0743188Z graph_break [] 2025-12-04T10:49:11.0743239Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0743387Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0743434Z Traceback (most recent call last): 2025-12-04T10:49:11.0743587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0743626Z method(*args, **kwargs) 2025-12-04T10:49:11.0743820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0743859Z method(*args, **kwargs) 2025-12-04T10:49:11.0744009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0744045Z with policy(): 2025-12-04T10:49:11.0744197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0744237Z raise RuntimeError(msg) 2025-12-04T10:49:11.0744641Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0744646Z 2025-12-04T10:49:11.0744717Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0745004Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0745006Z 2025-12-04T10:49:11.0745090Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0745173Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0745228Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0745507Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0745579Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0745615Z graph_break [] 2025-12-04T10:49:11.0745688Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0746026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0746070Z if out == self.unknown_value: 2025-12-04T10:49:11.0746139Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0746193Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0746263Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0746531Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0746566Z graph_break [] 2025-12-04T10:49:11.0746638Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0746692Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0746762Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0747029Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0747067Z graph_break [] 2025-12-04T10:49:11.0747309Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-db99f95f0f18e27e.xml - 2025-12-04T10:49:11.0747366Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0748021Z FAILED [0.6713s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0748024Z 2025-12-04T10:49:11.0748096Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0748381Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0748383Z 2025-12-04T10:49:11.0748467Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0748529Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0748596Z ================== 1 failed, 57 deselected, 2 rerun in 12.04s ================== 2025-12-04T10:49:11.0748632Z Got exit code 1 2025-12-04T10:49:11.0748673Z Retrying single test... 2025-12-04T10:49:11.0748870Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-88ebcb1c483f535c.xml 2025-12-04T10:49:11.0748939Z ============================= test session starts ============================== 2025-12-04T10:49:11.0749048Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0749102Z cachedir: .pytest_cache 2025-12-04T10:49:11.0749259Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0749305Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0749344Z configfile: pytest.ini 2025-12-04T10:49:11.0749506Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0749579Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0749861Z stepcurrent: skipping 24 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0749904Z Running 1 items in this shard 2025-12-04T10:49:11.0749906Z 2025-12-04T10:49:11.0750264Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:35.791038420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0750267Z 2025-12-04T10:49:11.0750420Z [W1204 10:22:42.507660658 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0750422Z 2025-12-04T10:49:11.0750571Z [W1204 10:22:42.507837495 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0750573Z 2025-12-04T10:49:11.0750722Z [W1204 10:22:42.511505916 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0750724Z 2025-12-04T10:49:11.0750870Z [W1204 10:22:42.511812341 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0750873Z 2025-12-04T10:49:11.0751020Z [W1204 10:22:42.511896139 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0751022Z 2025-12-04T10:49:11.0751190Z [W1204 10:22:42.514486461 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0751192Z
2025-12-04T10:49:11.0751338Z [W1204 10:22:42.514753536 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0751340Z
2025-12-04T10:49:11.0751486Z [W1204 10:22:42.514829194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0751489Z
2025-12-04T10:49:11.0751538Z ('RERUN', {'yellow': True}) [10.6890s] [100%]
2025-12-04T10:49:11.0751920Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:43.141194370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0751922Z
2025-12-04T10:49:11.0752071Z [W1204 10:22:43.141580753 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0752072Z
2025-12-04T10:49:11.0752218Z [W1204 10:22:43.141676281 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0752220Z
2025-12-04T10:49:11.0752367Z [W1204 10:22:43.143084345 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0752385Z
2025-12-04T10:49:11.0752531Z [W1204 10:22:43.143343350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0752546Z
2025-12-04T10:49:11.0752694Z [W1204 10:22:43.143419448 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1...
2025-12-04T10:49:11.0752696Z 2025-12-04T10:49:11.0752844Z [W1204 10:22:43.145682806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0752847Z 2025-12-04T10:49:11.0752994Z [W1204 10:22:43.145939001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0752996Z 2025-12-04T10:49:11.0753143Z [W1204 10:22:43.146018510 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0753146Z 2025-12-04T10:49:11.0753194Z ('RERUN', {'yellow': True}) [0.4789s] [100%] 2025-12-04T10:49:11.0753547Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:22:44.618130098 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0753550Z 2025-12-04T10:49:11.0753699Z [W1204 10:22:44.618549120 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0753702Z 2025-12-04T10:49:11.0753849Z [W1204 10:22:44.618637189 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0753850Z 2025-12-04T10:49:11.0753998Z [W1204 10:22:44.620027483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0754000Z 2025-12-04T10:49:11.0754146Z [W1204 10:22:44.620281318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0754149Z 2025-12-04T10:49:11.0754296Z [W1204 10:22:44.620358686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0754298Z 2025-12-04T10:49:11.0754443Z [W1204 10:22:44.622608844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0754472Z 2025-12-04T10:49:11.0754620Z [W1204 10:22:44.622861810 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0754622Z 2025-12-04T10:49:11.0754769Z [W1204 10:22:44.622938078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0754772Z 2025-12-04T10:49:11.0754809Z FAILED [0.4721s] [100%] 2025-12-04T10:49:11.0754811Z 2025-12-04T10:49:11.0754864Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0755015Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0755060Z Traceback (most recent call last): 2025-12-04T10:49:11.0755215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0755256Z method(*args, **kwargs) 2025-12-04T10:49:11.0755406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0755447Z method(*args, **kwargs) 2025-12-04T10:49:11.0755596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0755644Z with policy(): 2025-12-04T10:49:11.0755795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0755847Z raise RuntimeError(msg) 2025-12-04T10:49:11.0756237Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0756240Z 2025-12-04T10:49:11.0756314Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0756599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0756602Z 2025-12-04T10:49:11.0756687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0756759Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0756815Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0757087Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0757159Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0757197Z graph_break [] 2025-12-04T10:49:11.0757267Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0757611Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0757657Z if out == self.unknown_value: 2025-12-04T10:49:11.0757804Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0757849Z Traceback (most recent call last): 2025-12-04T10:49:11.0758000Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0758039Z method(*args, **kwargs) 2025-12-04T10:49:11.0758219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0758259Z method(*args, **kwargs) 2025-12-04T10:49:11.0758408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0758445Z with policy(): 2025-12-04T10:49:11.0758595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0758635Z raise RuntimeError(msg) 2025-12-04T10:49:11.0759033Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0759036Z 2025-12-04T10:49:11.0759109Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0759395Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0759398Z 2025-12-04T10:49:11.0759497Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0759568Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0759636Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0759905Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0759976Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0760012Z graph_break [] 2025-12-04T10:49:11.0760083Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0760423Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0760468Z if out == self.unknown_value: 2025-12-04T10:49:11.0760538Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0760593Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0760664Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0760933Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0760972Z graph_break [] 2025-12-04T10:49:11.0761022Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0761171Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0761218Z Traceback (most recent call last): 2025-12-04T10:49:11.0761369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0761410Z method(*args, **kwargs) 2025-12-04T10:49:11.0761560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0761600Z method(*args, **kwargs) 2025-12-04T10:49:11.0761748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0761785Z with policy(): 2025-12-04T10:49:11.0762015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0762056Z raise RuntimeError(msg) 2025-12-04T10:49:11.0762455Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0762460Z 2025-12-04T10:49:11.0762532Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0762815Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0762818Z 2025-12-04T10:49:11.0762904Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0762975Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0763029Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0763298Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0763381Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0763433Z graph_break [] 2025-12-04T10:49:11.0763503Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0763847Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0763889Z if out == self.unknown_value: 2025-12-04T10:49:11.0763961Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0764014Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0764085Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0764351Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0764393Z graph_break [] 2025-12-04T10:49:11.0764462Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0764515Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0764584Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0764852Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0764889Z graph_break [] 2025-12-04T10:49:11.0765130Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-88ebcb1c483f535c.xml - 2025-12-04T10:49:11.0765189Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0765839Z FAILED [0.4721s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0765842Z 2025-12-04T10:49:11.0765915Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0766200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0766203Z 2025-12-04T10:49:11.0766290Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0766352Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0766416Z ================== 1 failed, 57 deselected, 2 rerun in 11.77s ================== 2025-12-04T10:49:11.0766453Z Got exit code 1 2025-12-04T10:49:11.0766689Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0766817Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0767013Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-002ff2858e7c8ebf.xml 2025-12-04T10:49:11.0767080Z ============================= test session starts ============================== 2025-12-04T10:49:11.0767191Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0767246Z cachedir: .pytest_cache 2025-12-04T10:49:11.0767402Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0767448Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0767487Z configfile: pytest.ini 2025-12-04T10:49:11.0767651Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0767723Z collecting ... collected 58 items / 25 deselected / 33 selected 2025-12-04T10:49:11.0767776Z stepcurrent: skipping 25 already run items. 2025-12-04T10:49:11.0767819Z Running 33 items in this shard 2025-12-04T10:49:11.0767823Z 2025-12-04T10:49:11.0768070Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.4811s] [ 3%] 2025-12-04T10:49:11.0768317Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4559s] [ 3%] 2025-12-04T10:49:11.0768538Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.4661s] [ 3%] 2025-12-04T10:49:11.0768540Z 2025-12-04T10:49:11.0768592Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0768740Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0768786Z Traceback (most recent call last): 2025-12-04T10:49:11.0768941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0768983Z method(*args, **kwargs) 2025-12-04T10:49:11.0769133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0769173Z method(*args, **kwargs) 2025-12-04T10:49:11.0769321Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0769359Z with policy(): 2025-12-04T10:49:11.0769533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0769575Z raise RuntimeError(msg) 2025-12-04T10:49:11.0769971Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0769975Z 2025-12-04T10:49:11.0770046Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0770333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0770335Z 2025-12-04T10:49:11.0770422Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0770493Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0770546Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0770722Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0770803Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0770858Z graph_break [] 2025-12-04T10:49:11.0771005Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0771050Z Traceback (most recent call last): 
2025-12-04T10:49:11.0771203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0771241Z method(*args, **kwargs) 2025-12-04T10:49:11.0771394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0771432Z method(*args, **kwargs) 2025-12-04T10:49:11.0771581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0771617Z with policy(): 2025-12-04T10:49:11.0771769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0771810Z raise RuntimeError(msg) 2025-12-04T10:49:11.0772254Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0772257Z 2025-12-04T10:49:11.0772329Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0772619Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0772622Z 2025-12-04T10:49:11.0772706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0772778Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0772834Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0773006Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0773078Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0773114Z graph_break [] 2025-12-04T10:49:11.0773221Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0773275Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0773345Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0773517Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0773555Z graph_break [] 2025-12-04T10:49:11.0773605Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0773754Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0773798Z Traceback (most recent call last): 2025-12-04T10:49:11.0773950Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0773991Z method(*args, **kwargs) 2025-12-04T10:49:11.0774141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0774179Z method(*args, **kwargs) 2025-12-04T10:49:11.0774329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0774377Z with policy(): 2025-12-04T10:49:11.0774529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0774583Z raise RuntimeError(msg) 2025-12-04T10:49:11.0774989Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0774993Z 2025-12-04T10:49:11.0775067Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0775350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0775353Z 2025-12-04T10:49:11.0775438Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0775510Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0775564Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0775736Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0775808Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0775845Z graph_break [] 2025-12-04T10:49:11.0775917Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0775971Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0776041Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0776214Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0776251Z graph_break [] 2025-12-04T10:49:11.0776322Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0776376Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0776446Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0776637Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0776675Z graph_break [] 2025-12-04T10:49:11.0776916Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-002ff2858e7c8ebf.xml - 2025-12-04T10:49:11.0776978Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0777608Z FAILED [0.4661s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0777612Z 2025-12-04T10:49:11.0777685Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0777971Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0777983Z 2025-12-04T10:49:11.0778067Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0778128Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0778206Z ================== 1 failed, 25 deselected, 2 rerun in 3.54s =================== 2025-12-04T10:49:11.0778243Z Got exit code 1 2025-12-04T10:49:11.0778283Z Retrying single test... 
2025-12-04T10:49:11.0778478Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b37411f74dd42cb6.xml 2025-12-04T10:49:11.0778535Z ============================= test session starts ============================== 2025-12-04T10:49:11.0778647Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0778688Z cachedir: .pytest_cache 2025-12-04T10:49:11.0778845Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0778891Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0778931Z configfile: pytest.ini 2025-12-04T10:49:11.0779092Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0779167Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0779449Z stepcurrent: skipping 25 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0779495Z Running 1 items in this shard 2025-12-04T10:49:11.0779497Z 2025-12-04T10:49:11.0779855Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:03.363690941 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0779859Z 2025-12-04T10:49:11.0780010Z [W1204 10:23:11.661771141 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0780013Z 2025-12-04T10:49:11.0780163Z [W1204 10:23:11.661947128 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780165Z 2025-12-04T10:49:11.0780311Z [W1204 10:23:11.665950253 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780313Z 2025-12-04T10:49:11.0780483Z [W1204 10:23:11.666253077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780485Z 2025-12-04T10:49:11.0780632Z [W1204 10:23:11.666332696 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780635Z 2025-12-04T10:49:11.0780781Z [W1204 10:23:11.668743391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780784Z 2025-12-04T10:49:11.0780931Z [W1204 10:23:11.669010166 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0780933Z 2025-12-04T10:49:11.0781079Z [W1204 10:23:11.669097284 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0781081Z 2025-12-04T10:49:11.0781133Z ('RERUN', {'yellow': True}) [10.0623s] [100%] 2025-12-04T10:49:11.0781486Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:12.751229287 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0781498Z 2025-12-04T10:49:11.0781646Z [W1204 10:23:12.751634420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0781660Z 2025-12-04T10:49:11.0781808Z [W1204 10:23:12.751718588 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0781810Z 2025-12-04T10:49:11.0781997Z [W1204 10:23:12.753150411 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0781999Z 2025-12-04T10:49:11.0782148Z [W1204 10:23:12.753480685 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0782150Z 2025-12-04T10:49:11.0782295Z [W1204 10:23:12.753559344 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0782299Z 2025-12-04T10:49:11.0782445Z [W1204 10:23:12.755788552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0782448Z 2025-12-04T10:49:11.0782596Z [W1204 10:23:12.756055627 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0782598Z 2025-12-04T10:49:11.0782744Z [W1204 10:23:12.756133716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0782746Z 2025-12-04T10:49:11.0782796Z ('RERUN', {'yellow': True}) [0.5859s] [100%] 2025-12-04T10:49:11.0783233Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:12.335283772 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0783237Z 2025-12-04T10:49:11.0783389Z [W1204 10:23:12.335683455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0783391Z 2025-12-04T10:49:11.0783539Z [W1204 10:23:12.335767923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0783541Z 2025-12-04T10:49:11.0783686Z [W1204 10:23:12.337182507 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0783688Z 2025-12-04T10:49:11.0783871Z [W1204 10:23:12.337508980 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0783873Z 2025-12-04T10:49:11.0784019Z [W1204 10:23:12.337587349 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0784022Z 2025-12-04T10:49:11.0784169Z [W1204 10:23:12.339830187 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0784171Z 2025-12-04T10:49:11.0784319Z [W1204 10:23:12.340093812 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0784320Z 2025-12-04T10:49:11.0784467Z [W1204 10:23:12.340170921 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0784469Z 2025-12-04T10:49:11.0784508Z FAILED [0.5785s] [100%] 2025-12-04T10:49:11.0784509Z 2025-12-04T10:49:11.0784561Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0784713Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0784756Z Traceback (most recent call last): 2025-12-04T10:49:11.0784940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0784979Z method(*args, **kwargs) 2025-12-04T10:49:11.0785145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0785183Z method(*args, **kwargs) 2025-12-04T10:49:11.0785334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0785370Z with policy(): 2025-12-04T10:49:11.0785525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0785564Z raise RuntimeError(msg) 2025-12-04T10:49:11.0785961Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0785964Z 2025-12-04T10:49:11.0786037Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0786322Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0786324Z 2025-12-04T10:49:11.0786410Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0786482Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0786538Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0786711Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0786785Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0786821Z graph_break [] 2025-12-04T10:49:11.0786893Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0787235Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0787280Z if out == self.unknown_value: 2025-12-04T10:49:11.0787449Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0787493Z Traceback (most recent call last): 2025-12-04T10:49:11.0787646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0787686Z method(*args, **kwargs) 2025-12-04T10:49:11.0787836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0787876Z method(*args, **kwargs) 2025-12-04T10:49:11.0788025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0788061Z with policy(): 2025-12-04T10:49:11.0788213Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0788252Z raise RuntimeError(msg) 2025-12-04T10:49:11.0788661Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0788673Z 2025-12-04T10:49:11.0788745Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0789030Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0789046Z 2025-12-04T10:49:11.0789131Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0789202Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0789259Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0789433Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0789505Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0789542Z graph_break [] 2025-12-04T10:49:11.0789613Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0789955Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0790000Z if out == self.unknown_value: 2025-12-04T10:49:11.0790069Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0790124Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0790195Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0790370Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0790407Z graph_break [] 2025-12-04T10:49:11.0790460Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0790607Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0790653Z Traceback (most recent call last): 2025-12-04T10:49:11.0790807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0790845Z method(*args, **kwargs) 2025-12-04T10:49:11.0791016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0791055Z method(*args, **kwargs) 2025-12-04T10:49:11.0791204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0791239Z with policy(): 2025-12-04T10:49:11.0791390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0791431Z raise RuntimeError(msg) 2025-12-04T10:49:11.0791836Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0791839Z 2025-12-04T10:49:11.0791955Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0792246Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0792248Z 2025-12-04T10:49:11.0792334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0792430Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0792485Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0792680Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0792751Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0792787Z graph_break [] 2025-12-04T10:49:11.0792857Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0793197Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0793241Z if out == self.unknown_value: 2025-12-04T10:49:11.0793312Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0793366Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0793435Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0793609Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0793644Z graph_break [] 2025-12-04T10:49:11.0793714Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0793768Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0793838Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0794009Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0794047Z graph_break [] 2025-12-04T10:49:11.0794288Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b37411f74dd42cb6.xml - 2025-12-04T10:49:11.0794348Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0795000Z FAILED [0.5785s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0795003Z 2025-12-04T10:49:11.0795075Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0795361Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0795365Z 2025-12-04T10:49:11.0795449Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0795510Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0795575Z ================== 1 failed, 57 deselected, 2 rerun in 11.36s ================== 2025-12-04T10:49:11.0795613Z Got exit code 1 2025-12-04T10:49:11.0795652Z Retrying single test... 2025-12-04T10:49:11.0795850Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ad02b7982eedf6c5.xml 2025-12-04T10:49:11.0795908Z ============================= test session starts ============================== 2025-12-04T10:49:11.0796028Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0796068Z cachedir: .pytest_cache 2025-12-04T10:49:11.0796225Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0796283Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0796322Z configfile: pytest.ini 2025-12-04T10:49:11.0796484Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0796555Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0796840Z stepcurrent: skipping 25 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0796883Z Running 1 items in this shard 2025-12-04T10:49:11.0796886Z 2025-12-04T10:49:11.0797244Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:22.689176431 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0797248Z 2025-12-04T10:49:11.0797400Z [W1204 10:23:29.209872233 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0797402Z 2025-12-04T10:49:11.0797552Z [W1204 10:23:29.210035750 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0797553Z 2025-12-04T10:49:11.0797701Z [W1204 10:23:29.213666652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0797703Z 2025-12-04T10:49:11.0797849Z [W1204 10:23:29.213973606 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0797852Z 2025-12-04T10:49:11.0797998Z [W1204 10:23:29.214059955 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0798001Z 2025-12-04T10:49:11.0798147Z [W1204 10:23:29.216510959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0798150Z 2025-12-04T10:49:11.0798297Z [W1204 10:23:29.216780344 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0798326Z 2025-12-04T10:49:11.0798474Z [W1204 10:23:29.216857493 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0798475Z 2025-12-04T10:49:11.0798525Z ('RERUN', {'yellow': True}) [10.2448s] [100%] 2025-12-04T10:49:11.0798882Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:30.373516667 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0798886Z 2025-12-04T10:49:11.0799033Z [W1204 10:23:30.373955969 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799036Z 2025-12-04T10:49:11.0799183Z [W1204 10:23:30.374040847 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799185Z 2025-12-04T10:49:11.0799332Z [W1204 10:23:30.375424712 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799334Z 2025-12-04T10:49:11.0799479Z [W1204 10:23:30.375747266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799493Z 2025-12-04T10:49:11.0799640Z [W1204 10:23:30.375825524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799652Z 2025-12-04T10:49:11.0799797Z [W1204 10:23:30.378045723 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0799799Z 2025-12-04T10:49:11.0799945Z [W1204 10:23:30.378308358 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0799949Z 2025-12-04T10:49:11.0800095Z [W1204 10:23:30.378384537 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0800097Z 2025-12-04T10:49:11.0800145Z ('RERUN', {'yellow': True}) [0.6761s] [100%] 2025-12-04T10:49:11.0800499Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:23:31.077599327 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0800502Z 2025-12-04T10:49:11.0800649Z [W1204 10:23:31.078053508 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0800651Z 2025-12-04T10:49:11.0800797Z [W1204 10:23:31.078137197 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0800801Z 2025-12-04T10:49:11.0800947Z [W1204 10:23:31.079558200 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0800949Z 2025-12-04T10:49:11.0801095Z [W1204 10:23:31.079905524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0801097Z 2025-12-04T10:49:11.0801245Z [W1204 10:23:31.079982922 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0801247Z 2025-12-04T10:49:11.0801393Z [W1204 10:23:31.082235030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0801395Z 2025-12-04T10:49:11.0801542Z [W1204 10:23:31.082508305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0801543Z 2025-12-04T10:49:11.0801710Z [W1204 10:23:31.082585884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0801713Z 2025-12-04T10:49:11.0801750Z FAILED [0.7247s] [100%] 2025-12-04T10:49:11.0801752Z 2025-12-04T10:49:11.0801803Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0801983Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0802030Z Traceback (most recent call last): 2025-12-04T10:49:11.0802184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0802225Z method(*args, **kwargs) 2025-12-04T10:49:11.0802376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0802416Z method(*args, **kwargs) 2025-12-04T10:49:11.0802567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0802604Z with policy(): 2025-12-04T10:49:11.0802757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0802814Z raise RuntimeError(msg) 2025-12-04T10:49:11.0803208Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0803227Z 2025-12-04T10:49:11.0803300Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0803588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0803591Z 2025-12-04T10:49:11.0803676Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0803749Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0803806Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0803981Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0804054Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0804092Z graph_break [] 2025-12-04T10:49:11.0804161Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0804506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0804550Z if out == self.unknown_value: 2025-12-04T10:49:11.0804698Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0804742Z Traceback (most recent call last): 2025-12-04T10:49:11.0804898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0804938Z method(*args, **kwargs) 2025-12-04T10:49:11.0805090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0805129Z method(*args, **kwargs) 2025-12-04T10:49:11.0805303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0805341Z with policy(): 2025-12-04T10:49:11.0805493Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0805533Z raise RuntimeError(msg) 2025-12-04T10:49:11.0805939Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0805943Z 2025-12-04T10:49:11.0806015Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0806300Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0806304Z 2025-12-04T10:49:11.0806391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0806462Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0806517Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0806701Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0806773Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0806822Z graph_break [] 2025-12-04T10:49:11.0806891Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0807235Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0807277Z if out == self.unknown_value: 2025-12-04T10:49:11.0807347Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0807401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0807472Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0807645Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0807682Z graph_break [] 2025-12-04T10:49:11.0807732Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0807880Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0807924Z Traceback (most recent call last): 2025-12-04T10:49:11.0808078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0808116Z method(*args, **kwargs) 2025-12-04T10:49:11.0808268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0808307Z method(*args, **kwargs) 2025-12-04T10:49:11.0808456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0808495Z with policy(): 2025-12-04T10:49:11.0808646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0808686Z raise RuntimeError(msg) 2025-12-04T10:49:11.0809107Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0809110Z 2025-12-04T10:49:11.0809183Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0809469Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0809472Z 2025-12-04T10:49:11.0809558Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0809629Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0809685Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0809857Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0809930Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0809966Z graph_break [] 2025-12-04T10:49:11.0810036Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0810375Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0810437Z if out == self.unknown_value: 2025-12-04T10:49:11.0810508Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0810561Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0810632Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0810806Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0810845Z graph_break [] 2025-12-04T10:49:11.0810914Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0810967Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0811038Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0811210Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0811247Z graph_break [] 2025-12-04T10:49:11.0811491Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ad02b7982eedf6c5.xml - 2025-12-04T10:49:11.0811548Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0812220Z FAILED [0.7247s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0812224Z 2025-12-04T10:49:11.0812295Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0812579Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0812581Z 2025-12-04T10:49:11.0812665Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0812768Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0812834Z ================== 1 failed, 57 deselected, 2 rerun in 11.81s ================== 2025-12-04T10:49:11.0812870Z Got exit code 1 2025-12-04T10:49:11.0813109Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0813236Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0813434Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d1e3d809de7f51b9.xml 2025-12-04T10:49:11.0813492Z ============================= test session starts ============================== 2025-12-04T10:49:11.0813603Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0813646Z cachedir: .pytest_cache 2025-12-04T10:49:11.0813802Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0813848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0813890Z configfile: pytest.ini 2025-12-04T10:49:11.0814067Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0814140Z collecting ... collected 58 items / 26 deselected / 32 selected 2025-12-04T10:49:11.0814206Z stepcurrent: skipping 26 already run items. 2025-12-04T10:49:11.0814251Z Running 32 items in this shard 2025-12-04T10:49:11.0814253Z 2025-12-04T10:49:11.0814497Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5789s] [ 3%] 2025-12-04T10:49:11.0814736Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4704s] [ 3%] 2025-12-04T10:49:11.0814956Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.4530s] [ 3%] 2025-12-04T10:49:11.0814960Z 2025-12-04T10:49:11.0815010Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0815157Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0815202Z Traceback (most recent call last): 2025-12-04T10:49:11.0815356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0815396Z method(*args, **kwargs) 2025-12-04T10:49:11.0815549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0815590Z method(*args, **kwargs) 2025-12-04T10:49:11.0815740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0815779Z with policy(): 2025-12-04T10:49:11.0815931Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0815972Z raise RuntimeError(msg) 2025-12-04T10:49:11.0816362Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0816364Z 2025-12-04T10:49:11.0816459Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0816742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0816745Z 2025-12-04T10:49:11.0816832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0816904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0816957Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0817133Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0817204Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0817241Z graph_break [] 2025-12-04T10:49:11.0817387Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0817432Z Traceback (most recent call last): 2025-12-04T10:49:11.0817583Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0817624Z method(*args, **kwargs) 
2025-12-04T10:49:11.0817784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0817822Z method(*args, **kwargs) 2025-12-04T10:49:11.0817984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0818022Z with policy(): 2025-12-04T10:49:11.0818175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0818216Z raise RuntimeError(msg) 2025-12-04T10:49:11.0818611Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0818615Z 2025-12-04T10:49:11.0818687Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0818973Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0818976Z 2025-12-04T10:49:11.0819061Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0819134Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0819187Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0819363Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0819434Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0819472Z 
graph_break [] 2025-12-04T10:49:11.0819543Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0819599Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0819669Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0819843Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0819879Z graph_break [] 2025-12-04T10:49:11.0819931Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0820099Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0820145Z Traceback (most recent call last): 2025-12-04T10:49:11.0820298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0820339Z method(*args, **kwargs) 2025-12-04T10:49:11.0820488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0820528Z method(*args, **kwargs) 2025-12-04T10:49:11.0820678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0820713Z with policy(): 2025-12-04T10:49:11.0820865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0820904Z raise RuntimeError(msg) 2025-12-04T10:49:11.0821302Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. 
CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0821316Z 2025-12-04T10:49:11.0821387Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0821669Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0821682Z 2025-12-04T10:49:11.0821766Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0821837Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0821928Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0822102Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0822175Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0823826Z graph_break [] 2025-12-04T10:49:11.0823902Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0823957Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0824031Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0824207Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0824243Z graph_break [] 2025-12-04T10:49:11.0824315Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0824369Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0824439Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0824610Z inductor [('pattern_matcher_nodes', 
8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0824649Z graph_break [] 2025-12-04T10:49:11.0824892Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d1e3d809de7f51b9.xml - 2025-12-04T10:49:11.0824954Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0825619Z FAILED [0.4530s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0825622Z 2025-12-04T10:49:11.0825693Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0825980Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0825983Z 2025-12-04T10:49:11.0826067Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0826129Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0826196Z ================== 1 failed, 26 deselected, 2 rerun in 3.64s =================== 2025-12-04T10:49:11.0826232Z Got exit code 1 2025-12-04T10:49:11.0826275Z Retrying single test... 
2025-12-04T10:49:11.0826471Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-00b6d235f2865021.xml 2025-12-04T10:49:11.0826528Z ============================= test session starts ============================== 2025-12-04T10:49:11.0826656Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0826699Z cachedir: .pytest_cache 2025-12-04T10:49:11.0826856Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0826933Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0826973Z configfile: pytest.ini 2025-12-04T10:49:11.0827135Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0827211Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0827492Z stepcurrent: skipping 26 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0827536Z Running 1 items in this shard 2025-12-04T10:49:11.0827539Z 2025-12-04T10:49:11.0827897Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:23:51.291566316 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0827900Z 2025-12-04T10:49:11.0828052Z [W1204 10:23:59.698960200 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0828054Z 2025-12-04T10:49:11.0828204Z [W1204 10:23:59.699141537 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828207Z 2025-12-04T10:49:11.0828356Z [W1204 10:23:59.702608302 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828358Z 2025-12-04T10:49:11.0828505Z [W1204 10:23:59.702946306 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828507Z 2025-12-04T10:49:11.0828654Z [W1204 10:23:59.703034114 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828658Z 2025-12-04T10:49:11.0828804Z [W1204 10:23:59.705488239 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828806Z 2025-12-04T10:49:11.0828976Z [W1204 10:23:59.705764634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0828979Z 2025-12-04T10:49:11.0829126Z [W1204 10:23:59.705841772 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0829128Z 2025-12-04T10:49:11.0829178Z ('RERUN', {'yellow': True}) [10.0442s] [100%] 2025-12-04T10:49:11.0829533Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:24:00.864841767 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0829537Z 2025-12-04T10:49:11.0829684Z [W1204 10:24:00.865354428 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0829687Z 2025-12-04T10:49:11.0829835Z [W1204 10:24:00.865473926 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0829837Z 2025-12-04T10:49:11.0829983Z [W1204 10:24:00.866924359 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0829985Z 2025-12-04T10:49:11.0830133Z [W1204 10:24:00.867327741 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0830146Z 2025-12-04T10:49:11.0830294Z [W1204 10:24:00.867417650 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0830310Z 2025-12-04T10:49:11.0830457Z [W1204 10:24:00.869684507 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0830459Z 2025-12-04T10:49:11.0830608Z [W1204 10:24:00.869971402 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0830610Z 2025-12-04T10:49:11.0830757Z [W1204 10:24:00.870069690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0830759Z 2025-12-04T10:49:11.0830807Z ('RERUN', {'yellow': True}) [0.6192s] [100%] 2025-12-04T10:49:11.0831157Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:24:00.463671686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831160Z 2025-12-04T10:49:11.0831306Z [W1204 10:24:00.464087498 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831308Z 2025-12-04T10:49:11.0831457Z [W1204 10:24:00.464187656 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831458Z 2025-12-04T10:49:11.0831605Z [W1204 10:24:00.465583870 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831607Z 2025-12-04T10:49:11.0831754Z [W1204 10:24:00.465931774 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831757Z 2025-12-04T10:49:11.0831941Z [W1204 10:24:00.466026742 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0831944Z 2025-12-04T10:49:11.0832090Z [W1204 10:24:00.468249721 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0832092Z 2025-12-04T10:49:11.0832238Z [W1204 10:24:00.468521626 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0832275Z 2025-12-04T10:49:11.0832423Z [W1204 10:24:00.468600554 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0832425Z 2025-12-04T10:49:11.0832462Z FAILED [0.6119s] [100%] 2025-12-04T10:49:11.0832464Z 2025-12-04T10:49:11.0832517Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0832665Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0832712Z Traceback (most recent call last): 2025-12-04T10:49:11.0832869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0832910Z method(*args, **kwargs) 2025-12-04T10:49:11.0833061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0833102Z method(*args, **kwargs) 2025-12-04T10:49:11.0833253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0833290Z with policy(): 2025-12-04T10:49:11.0833442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0833498Z raise RuntimeError(msg) 2025-12-04T10:49:11.0833890Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0833907Z 2025-12-04T10:49:11.0833980Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0834267Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0834269Z 2025-12-04T10:49:11.0834354Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0834427Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0834482Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0834658Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0834730Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0834767Z graph_break [] 2025-12-04T10:49:11.0834837Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0835183Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0835227Z if out == self.unknown_value: 2025-12-04T10:49:11.0835373Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0835418Z Traceback (most recent call last): 2025-12-04T10:49:11.0835571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0835611Z method(*args, **kwargs) 2025-12-04T10:49:11.0835761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0835800Z method(*args, **kwargs) 2025-12-04T10:49:11.0835970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0836007Z with policy(): 2025-12-04T10:49:11.0836157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0836198Z raise RuntimeError(msg) 2025-12-04T10:49:11.0836593Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0836598Z 2025-12-04T10:49:11.0836671Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0836956Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0836958Z 2025-12-04T10:49:11.0837044Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0837114Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0837169Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0837352Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0837438Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0837475Z graph_break [] 2025-12-04T10:49:11.0837545Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0837890Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0837932Z if out == self.unknown_value: 2025-12-04T10:49:11.0838003Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0838056Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0838128Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0838301Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0838340Z graph_break [] 2025-12-04T10:49:11.0838391Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0838538Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0838582Z Traceback (most recent call last): 2025-12-04T10:49:11.0838737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0838775Z method(*args, **kwargs) 2025-12-04T10:49:11.0838925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0838964Z method(*args, **kwargs) 2025-12-04T10:49:11.0839115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0839152Z with policy(): 2025-12-04T10:49:11.0839302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0839341Z raise RuntimeError(msg) 2025-12-04T10:49:11.0839759Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0839761Z 2025-12-04T10:49:11.0839834Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0840117Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0840123Z 2025-12-04T10:49:11.0840209Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0840280Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0840334Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0840509Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0840663Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0840698Z graph_break [] 2025-12-04T10:49:11.0840769Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0841109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0841188Z if out == self.unknown_value: 2025-12-04T10:49:11.0841260Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0841313Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0841384Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0841558Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0841595Z graph_break [] 2025-12-04T10:49:11.0841664Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0841719Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0841789Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0842012Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0842049Z graph_break [] 2025-12-04T10:49:11.0842290Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-00b6d235f2865021.xml - 2025-12-04T10:49:11.0842348Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0842971Z FAILED [0.6119s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0842974Z 2025-12-04T10:49:11.0843046Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0843328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0843331Z 2025-12-04T10:49:11.0843415Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0843513Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0843580Z ================== 1 failed, 57 deselected, 2 rerun in 11.43s ================== 2025-12-04T10:49:11.0843616Z Got exit code 1 2025-12-04T10:49:11.0843656Z Retrying single test... 2025-12-04T10:49:11.0843851Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-de6b0224933644bb.xml 2025-12-04T10:49:11.0843908Z ============================= test session starts ============================== 2025-12-04T10:49:11.0844019Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0844060Z cachedir: .pytest_cache 2025-12-04T10:49:11.0844218Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0844264Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0844306Z configfile: pytest.ini 2025-12-04T10:49:11.0844470Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0844543Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0844824Z stepcurrent: skipping 26 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0844897Z Running 1 items in this shard 2025-12-04T10:49:11.0844899Z 2025-12-04T10:49:11.0845254Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:24:10.318355874 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0845256Z 2025-12-04T10:49:11.0845410Z [W1204 10:24:18.937552536 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0845412Z 2025-12-04T10:49:11.0845561Z [W1204 10:24:18.937702993 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0845565Z 2025-12-04T10:49:11.0845713Z [W1204 10:24:18.941210558 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0845715Z 2025-12-04T10:49:11.0845864Z [W1204 10:24:18.941530662 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0845866Z 2025-12-04T10:49:11.0846013Z [W1204 10:24:18.941612370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0846014Z 2025-12-04T10:49:11.0846163Z [W1204 10:24:18.944049775 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0846165Z 2025-12-04T10:49:11.0846311Z [W1204 10:24:18.944313050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0846314Z 2025-12-04T10:49:11.0846461Z [W1204 10:24:18.944389779 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0846463Z 2025-12-04T10:49:11.0846513Z ('RERUN', {'yellow': True}) [10.3181s] [100%] 2025-12-04T10:49:11.0846866Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:24:19.099150834 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0846870Z 2025-12-04T10:49:11.0847040Z [W1204 10:24:19.099545077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847044Z 2025-12-04T10:49:11.0847189Z [W1204 10:24:19.099626325 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847192Z 2025-12-04T10:49:11.0847339Z [W1204 10:24:19.100999040 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847341Z 2025-12-04T10:49:11.0847488Z [W1204 10:24:19.101327274 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847492Z 2025-12-04T10:49:11.0847638Z [W1204 10:24:19.101406352 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847639Z 2025-12-04T10:49:11.0847788Z [W1204 10:24:19.103615621 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0847790Z 2025-12-04T10:49:11.0847937Z [W1204 10:24:19.103874186 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0847939Z 2025-12-04T10:49:11.0848097Z [W1204 10:24:19.103951005 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0848098Z 2025-12-04T10:49:11.0848147Z ('RERUN', {'yellow': True}) [0.6123s] [100%] 2025-12-04T10:49:11.0848509Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:24:20.725094159 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0848512Z 2025-12-04T10:49:11.0848662Z [W1204 10:24:20.725484222 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0848664Z 2025-12-04T10:49:11.0848810Z [W1204 10:24:20.725565431 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0848812Z 2025-12-04T10:49:11.0848960Z [W1204 10:24:20.726955455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0848962Z 2025-12-04T10:49:11.0849109Z [W1204 10:24:20.727284319 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0849112Z 2025-12-04T10:49:11.0849259Z [W1204 10:24:20.727363607 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0849261Z 2025-12-04T10:49:11.0849409Z [W1204 10:24:20.729551857 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0849410Z 2025-12-04T10:49:11.0849557Z [W1204 10:24:20.729808422 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0849559Z 2025-12-04T10:49:11.0849707Z [W1204 10:24:20.729883550 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0849710Z 2025-12-04T10:49:11.0849747Z FAILED [0.6205s] [100%] 2025-12-04T10:49:11.0849750Z 2025-12-04T10:49:11.0849802Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0849950Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0849995Z Traceback (most recent call last): 2025-12-04T10:49:11.0850173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0850213Z method(*args, **kwargs) 2025-12-04T10:49:11.0850364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0850403Z method(*args, **kwargs) 2025-12-04T10:49:11.0850553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0850590Z with policy(): 2025-12-04T10:49:11.0850743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0850783Z raise RuntimeError(msg) 2025-12-04T10:49:11.0851175Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0851178Z 2025-12-04T10:49:11.0851251Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0851536Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0851552Z 2025-12-04T10:49:11.0851638Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0851722Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0851778Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0851995Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0852071Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0852107Z graph_break [] 2025-12-04T10:49:11.0852177Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0852521Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0852565Z if out == self.unknown_value: 2025-12-04T10:49:11.0852709Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0852755Z Traceback (most recent call last): 2025-12-04T10:49:11.0852906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0852946Z method(*args, **kwargs) 2025-12-04T10:49:11.0853096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0853136Z method(*args, **kwargs) 2025-12-04T10:49:11.0853285Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0853321Z with policy(): 2025-12-04T10:49:11.0853472Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0853514Z raise RuntimeError(msg) 2025-12-04T10:49:11.0853909Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0853912Z 2025-12-04T10:49:11.0853984Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0854304Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0854306Z 2025-12-04T10:49:11.0854392Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0854465Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0854519Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0854699Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0854770Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0854806Z graph_break [] 2025-12-04T10:49:11.0854878Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0855220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0855276Z if out == self.unknown_value: 2025-12-04T10:49:11.0855346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0855401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0855486Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0855661Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0855697Z graph_break [] 2025-12-04T10:49:11.0855748Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0855895Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0855941Z Traceback (most recent call last): 2025-12-04T10:49:11.0856091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0856132Z method(*args, **kwargs) 2025-12-04T10:49:11.0856281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0856321Z method(*args, **kwargs) 2025-12-04T10:49:11.0856470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0856506Z with policy(): 2025-12-04T10:49:11.0856657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0856697Z raise RuntimeError(msg) 2025-12-04T10:49:11.0857096Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0857102Z 2025-12-04T10:49:11.0857173Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0857455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0857459Z 2025-12-04T10:49:11.0857543Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0857613Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0857687Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0857860Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0857930Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0857968Z graph_break [] 2025-12-04T10:49:11.0858038Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0858375Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0858418Z if out == self.unknown_value: 2025-12-04T10:49:11.0858489Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0858543Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0858614Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0858789Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0858836Z graph_break [] 2025-12-04T10:49:11.0858907Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0858959Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0859045Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0859216Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0859252Z graph_break [] 2025-12-04T10:49:11.0859495Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-de6b0224933644bb.xml - 2025-12-04T10:49:11.0859555Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0860175Z FAILED [0.6205s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0860179Z 2025-12-04T10:49:11.0860251Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0860535Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0860537Z 2025-12-04T10:49:11.0860621Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0860682Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0860749Z ================== 1 failed, 57 deselected, 2 rerun in 11.70s ================== 2025-12-04T10:49:11.0860786Z Got exit code 1 2025-12-04T10:49:11.0861022Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0861151Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0861347Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c9eca7813ba492a0.xml 2025-12-04T10:49:11.0861424Z ============================= test session starts ============================== 2025-12-04T10:49:11.0861534Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0861576Z cachedir: .pytest_cache 2025-12-04T10:49:11.0861730Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0861776Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0861816Z configfile: pytest.ini 2025-12-04T10:49:11.0862017Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0862090Z collecting ... collected 58 items / 27 deselected / 31 selected 2025-12-04T10:49:11.0862142Z stepcurrent: skipping 27 already run items. 2025-12-04T10:49:11.0862185Z Running 31 items in this shard 2025-12-04T10:49:11.0862188Z 2025-12-04T10:49:11.0862434Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0312s] [ 3%] 2025-12-04T10:49:11.0862676Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5926s] [ 3%] 2025-12-04T10:49:11.0862910Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.5938s] [ 3%] 2025-12-04T10:49:11.0862927Z 2025-12-04T10:49:11.0862979Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0863124Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0863170Z Traceback (most recent call last): 2025-12-04T10:49:11.0863328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0863370Z method(*args, **kwargs) 2025-12-04T10:49:11.0863520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0863561Z method(*args, **kwargs) 2025-12-04T10:49:11.0863709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0863747Z with policy(): 2025-12-04T10:49:11.0863898Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0863938Z raise RuntimeError(msg) 2025-12-04T10:49:11.0864330Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0864332Z 2025-12-04T10:49:11.0864403Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0864690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0864693Z 2025-12-04T10:49:11.0864778Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0864849Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0864903Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0865199Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0865273Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0865309Z graph_break [] 2025-12-04T10:49:11.0865457Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0865502Z Traceback (most recent call last): 2025-12-04T10:49:11.0865653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.0865693Z method(*args, **kwargs) 2025-12-04T10:49:11.0865843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0865882Z method(*args, **kwargs) 2025-12-04T10:49:11.0866035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0866072Z with policy(): 2025-12-04T10:49:11.0866224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0866263Z raise RuntimeError(msg) 2025-12-04T10:49:11.0866658Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0866680Z 2025-12-04T10:49:11.0866752Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0867036Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0867041Z 2025-12-04T10:49:11.0867126Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0867196Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0867251Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0867522Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0867594Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0867630Z graph_break [] 2025-12-04T10:49:11.0867702Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0867755Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0867825Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0868093Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0868129Z graph_break [] 2025-12-04T10:49:11.0868182Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0868328Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.0868373Z Traceback (most recent call last): 2025-12-04T10:49:11.0868525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0868564Z method(*args, **kwargs) 2025-12-04T10:49:11.0868713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0868780Z method(*args, **kwargs) 2025-12-04T10:49:11.0868930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0868967Z with policy(): 2025-12-04T10:49:11.0869117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0869159Z raise RuntimeError(msg) 2025-12-04T10:49:11.0869555Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.0869559Z 2025-12-04T10:49:11.0869632Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0869919Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0869922Z 2025-12-04T10:49:11.0870005Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0870087Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0870141Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0870411Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0870493Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0870530Z graph_break [] 2025-12-04T10:49:11.0870599Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0870654Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0870724Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0870991Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0871028Z graph_break [] 2025-12-04T10:49:11.0871098Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0871155Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0871226Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0871493Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0871529Z graph_break [] 2025-12-04T10:49:11.0871771Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c9eca7813ba492a0.xml - 2025-12-04T10:49:11.0871829Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0872502Z FAILED [0.5938s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0872506Z 2025-12-04T10:49:11.0872601Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0872884Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0872887Z 2025-12-04T10:49:11.0872971Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0873031Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.0873097Z ================== 1 failed, 27 deselected, 2 rerun in 4.37s =================== 2025-12-04T10:49:11.0873133Z Got exit code 1 2025-12-04T10:49:11.0873174Z Retrying single test... 2025-12-04T10:49:11.0873368Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bea9009210b6950a.xml 2025-12-04T10:49:11.0873426Z ============================= test session starts ============================== 2025-12-04T10:49:11.0873536Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0873577Z cachedir: .pytest_cache 2025-12-04T10:49:11.0873734Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0873792Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0873831Z configfile: pytest.ini 2025-12-04T10:49:11.0873993Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0874082Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0874365Z stepcurrent: skipping 27 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0874409Z Running 1 items in this shard 2025-12-04T10:49:11.0874411Z 2025-12-04T10:49:11.0874766Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:24:42.555769062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0874770Z 2025-12-04T10:49:11.0874922Z [W1204 10:24:49.174641447 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0874925Z 2025-12-04T10:49:11.0875074Z [W1204 10:24:49.174793894 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875076Z 2025-12-04T10:49:11.0875224Z [W1204 10:24:49.178120343 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875226Z 2025-12-04T10:49:11.0875374Z [W1204 10:24:49.178385038 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875376Z 2025-12-04T10:49:11.0875522Z [W1204 10:24:49.178462476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875525Z 2025-12-04T10:49:11.0875671Z [W1204 10:24:49.180979500 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875674Z 2025-12-04T10:49:11.0875821Z [W1204 10:24:49.181247285 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0875823Z 2025-12-04T10:49:11.0875970Z [W1204 10:24:49.181323173 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0875972Z 2025-12-04T10:49:11.0876040Z ('RERUN', {'yellow': True}) [10.5051s] [100%] 2025-12-04T10:49:11.0876394Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:24:50.787814888 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0876399Z 2025-12-04T10:49:11.0876546Z [W1204 10:24:50.788194621 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0876549Z 2025-12-04T10:49:11.0876695Z [W1204 10:24:50.788291589 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0876697Z 2025-12-04T10:49:11.0876844Z [W1204 10:24:50.789678573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0876846Z 2025-12-04T10:49:11.0876993Z [W1204 10:24:50.789941698 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0876994Z 2025-12-04T10:49:11.0877142Z [W1204 10:24:50.790026207 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0877156Z 2025-12-04T10:49:11.0877303Z [W1204 10:24:50.792310474 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0877315Z 2025-12-04T10:49:11.0877462Z [W1204 10:24:50.792567390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0877464Z 2025-12-04T10:49:11.0877610Z [W1204 10:24:50.792642118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0877612Z 2025-12-04T10:49:11.0877659Z ('RERUN', {'yellow': True}) [0.4559s] [100%] 2025-12-04T10:49:11.0878010Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:24:50.231244608 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878013Z 2025-12-04T10:49:11.0878159Z [W1204 10:24:50.231627781 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878163Z 2025-12-04T10:49:11.0878311Z [W1204 10:24:50.231711229 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878313Z 2025-12-04T10:49:11.0878463Z [W1204 10:24:50.233128953 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878464Z 2025-12-04T10:49:11.0878612Z [W1204 10:24:50.233386418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878614Z 2025-12-04T10:49:11.0878761Z [W1204 10:24:50.233460397 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0878764Z 2025-12-04T10:49:11.0878909Z [W1204 10:24:50.235844103 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0878912Z 2025-12-04T10:49:11.0879059Z [W1204 10:24:50.236351903 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0879061Z 2025-12-04T10:49:11.0879208Z [W1204 10:24:50.236463291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0879209Z 2025-12-04T10:49:11.0879246Z FAILED [0.4489s] [100%] 2025-12-04T10:49:11.0879248Z 2025-12-04T10:49:11.0879319Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0879466Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0879511Z Traceback (most recent call last): 2025-12-04T10:49:11.0879669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0879708Z method(*args, **kwargs) 2025-12-04T10:49:11.0879861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0879901Z method(*args, **kwargs) 2025-12-04T10:49:11.0880050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0880087Z with policy(): 2025-12-04T10:49:11.0880239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0880279Z raise RuntimeError(msg) 2025-12-04T10:49:11.0880677Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0880690Z 2025-12-04T10:49:11.0880762Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0881066Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0881068Z 2025-12-04T10:49:11.0881152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0881225Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0881280Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0881551Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0881623Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0881661Z graph_break [] 2025-12-04T10:49:11.0881731Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0882110Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0882156Z if out == self.unknown_value: 2025-12-04T10:49:11.0882301Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0882347Z Traceback (most recent call last): 2025-12-04T10:49:11.0882498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0882538Z method(*args, **kwargs) 2025-12-04T10:49:11.0882689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0882729Z method(*args, **kwargs) 2025-12-04T10:49:11.0882879Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0882915Z with policy(): 2025-12-04T10:49:11.0883065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0883140Z raise RuntimeError(msg) 2025-12-04T10:49:11.0883536Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0883539Z 2025-12-04T10:49:11.0883611Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0883896Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0883899Z 2025-12-04T10:49:11.0883984Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0884057Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0884111Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0884380Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0884463Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0884500Z graph_break [] 2025-12-04T10:49:11.0884569Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0884923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0884966Z if out == self.unknown_value: 2025-12-04T10:49:11.0885038Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0885092Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0885163Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0885430Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0885467Z graph_break [] 2025-12-04T10:49:11.0885519Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0885666Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0885710Z Traceback (most recent call last): 2025-12-04T10:49:11.0885862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0885904Z method(*args, **kwargs) 2025-12-04T10:49:11.0886052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0886091Z method(*args, **kwargs) 2025-12-04T10:49:11.0886241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0886278Z with policy(): 2025-12-04T10:49:11.0886428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0886470Z raise RuntimeError(msg) 2025-12-04T10:49:11.0886889Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0886893Z 2025-12-04T10:49:11.0886966Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0887251Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0887254Z 2025-12-04T10:49:11.0887339Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0887411Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0887464Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0887733Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0887804Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0887840Z graph_break [] 2025-12-04T10:49:11.0887910Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0888250Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0888305Z if out == self.unknown_value: 2025-12-04T10:49:11.0888387Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0888440Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0888511Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0888782Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0888819Z graph_break [] 2025-12-04T10:49:11.0888888Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0888942Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0889012Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0889284Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0889322Z graph_break [] 2025-12-04T10:49:11.0889564Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-bea9009210b6950a.xml - 2025-12-04T10:49:11.0889624Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0890246Z FAILED [0.4489s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0890250Z 2025-12-04T10:49:11.0890322Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0890604Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0890606Z 2025-12-04T10:49:11.0890713Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0890775Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0890841Z ================== 1 failed, 57 deselected, 2 rerun in 11.56s ================== 2025-12-04T10:49:11.0890878Z Got exit code 1 2025-12-04T10:49:11.0890918Z Retrying single test... 2025-12-04T10:49:11.0891114Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a24c0da403582f2c.xml 2025-12-04T10:49:11.0891171Z ============================= test session starts ============================== 2025-12-04T10:49:11.0891281Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0891322Z cachedir: .pytest_cache 2025-12-04T10:49:11.0891481Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0891526Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0891566Z configfile: pytest.ini 2025-12-04T10:49:11.0891726Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0891811Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0892132Z stepcurrent: skipping 27 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0892206Z Running 1 items in this shard 2025-12-04T10:49:11.0892208Z 2025-12-04T10:49:11.0892564Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:25:00.630941323 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0892567Z 2025-12-04T10:49:11.0892717Z [W1204 10:25:07.923019599 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0892719Z 2025-12-04T10:49:11.0892869Z [W1204 10:25:07.923188426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0892872Z 2025-12-04T10:49:11.0893018Z [W1204 10:25:07.926729330 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0893021Z 2025-12-04T10:49:11.0893167Z [W1204 10:25:07.926994745 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0893169Z 2025-12-04T10:49:11.0893317Z [W1204 10:25:07.927075044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0893319Z 2025-12-04T10:49:11.0893464Z [W1204 10:25:07.929605217 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0893466Z 2025-12-04T10:49:11.0893612Z [W1204 10:25:07.929890702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0893615Z 2025-12-04T10:49:11.0893761Z [W1204 10:25:07.929967430 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0893765Z 2025-12-04T10:49:11.0893816Z ('RERUN', {'yellow': True}) [10.2321s] [100%] 2025-12-04T10:49:11.0894202Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:25:07.519621565 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0894205Z 2025-12-04T10:49:11.0894353Z [W1204 10:25:07.519988678 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0894355Z 2025-12-04T10:49:11.0894501Z [W1204 10:25:07.520081427 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0894504Z 2025-12-04T10:49:11.0894652Z [W1204 10:25:07.521479361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0894654Z 2025-12-04T10:49:11.0894801Z [W1204 10:25:07.521725946 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0894802Z 2025-12-04T10:49:11.0894947Z [W1204 10:25:07.521802145 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0894952Z 2025-12-04T10:49:11.0895102Z [W1204 10:25:07.524068623 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0895104Z 2025-12-04T10:49:11.0895250Z [W1204 10:25:07.524319948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0895283Z 2025-12-04T10:49:11.0895430Z [W1204 10:25:07.524394697 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0895441Z 2025-12-04T10:49:11.0895491Z ('RERUN', {'yellow': True}) [0.4515s] [100%] 2025-12-04T10:49:11.0895842Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:25:08.972705483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0895846Z 2025-12-04T10:49:11.0895996Z [W1204 10:25:08.973080536 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0895999Z 2025-12-04T10:49:11.0896146Z [W1204 10:25:08.973167884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0896149Z 2025-12-04T10:49:11.0896294Z [W1204 10:25:08.974538689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0896297Z 2025-12-04T10:49:11.0896445Z [W1204 10:25:08.974788254 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0896447Z 2025-12-04T10:49:11.0896592Z [W1204 10:25:08.974863723 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0896594Z 2025-12-04T10:49:11.0896742Z [W1204 10:25:08.977108532 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0896744Z 2025-12-04T10:49:11.0896891Z [W1204 10:25:08.977358637 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0896894Z 2025-12-04T10:49:11.0897039Z [W1204 10:25:08.977433905 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0897043Z 2025-12-04T10:49:11.0897080Z FAILED [0.4476s] [100%] 2025-12-04T10:49:11.0897083Z 2025-12-04T10:49:11.0897134Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0897281Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0897325Z Traceback (most recent call last): 2025-12-04T10:49:11.0897501Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0897541Z method(*args, **kwargs) 2025-12-04T10:49:11.0897692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0897732Z method(*args, **kwargs) 2025-12-04T10:49:11.0897884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0897922Z with policy(): 2025-12-04T10:49:11.0898076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.0898115Z raise RuntimeError(msg) 2025-12-04T10:49:11.0898513Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.0898515Z 2025-12-04T10:49:11.0898588Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0898871Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0898897Z 2025-12-04T10:49:11.0898983Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0899053Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0899108Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0899378Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0899451Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0899486Z graph_break [] 2025-12-04T10:49:11.0899557Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0899902Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0899945Z if out == self.unknown_value: 2025-12-04T10:49:11.0900093Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0900136Z Traceback (most recent call last): 2025-12-04T10:49:11.0900291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0900330Z method(*args, **kwargs) 2025-12-04T10:49:11.0900481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0900520Z method(*args, **kwargs) 2025-12-04T10:49:11.0900670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0900706Z with policy(): 2025-12-04T10:49:11.0900858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0900897Z raise RuntimeError(msg) 2025-12-04T10:49:11.0901313Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.0901316Z 2025-12-04T10:49:11.0901388Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0901671Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0901676Z 2025-12-04T10:49:11.0901762Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0901834Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0901923Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0902193Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0902264Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0902302Z graph_break [] 2025-12-04T10:49:11.0902373Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0902734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0902790Z if out == self.unknown_value: 2025-12-04T10:49:11.0902860Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0902914Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0902984Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0903254Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0903290Z graph_break [] 2025-12-04T10:49:11.0903340Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0903489Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.0903533Z Traceback (most recent call last): 2025-12-04T10:49:11.0903685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0903724Z method(*args, **kwargs) 2025-12-04T10:49:11.0903874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0903912Z method(*args, **kwargs) 2025-12-04T10:49:11.0904062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0904098Z with policy(): 2025-12-04T10:49:11.0904248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0904289Z raise RuntimeError(msg) 2025-12-04T10:49:11.0904689Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0904693Z 2025-12-04T10:49:11.0904764Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0905077Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0905079Z 2025-12-04T10:49:11.0905165Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0905235Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0905291Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0905559Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0905631Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0905666Z graph_break [] 2025-12-04T10:49:11.0905737Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0906079Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0906124Z if out == self.unknown_value: 2025-12-04T10:49:11.0906194Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0906260Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0906330Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0906614Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0906651Z graph_break [] 2025-12-04T10:49:11.0906720Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0906774Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0906843Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0907115Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0907152Z graph_break [] 2025-12-04T10:49:11.0907393Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a24c0da403582f2c.xml - 2025-12-04T10:49:11.0907452Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0908077Z FAILED [0.4476s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.0908080Z 2025-12-04T10:49:11.0908153Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0908435Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0908438Z 2025-12-04T10:49:11.0908522Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0908582Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0908649Z ================== 1 failed, 57 deselected, 2 rerun in 11.29s ================== 2025-12-04T10:49:11.0908710Z Got exit code 1 2025-12-04T10:49:11.0908944Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.0909071Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0909267Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fd829784cb5c9d9a.xml 2025-12-04T10:49:11.0909323Z ============================= test session starts ============================== 2025-12-04T10:49:11.0909436Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0909480Z cachedir: .pytest_cache 2025-12-04T10:49:11.0909638Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0909684Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.0909724Z configfile: pytest.ini 2025-12-04T10:49:11.0909884Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0909958Z collecting ... collected 58 items / 28 deselected / 30 selected 2025-12-04T10:49:11.0910020Z stepcurrent: skipping 28 already run items. 2025-12-04T10:49:11.0910064Z Running 30 items in this shard 2025-12-04T10:49:11.0910066Z 2025-12-04T10:49:11.0910325Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.5213s] [ 3%] 2025-12-04T10:49:11.0910567Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4479s] [ 3%] 2025-12-04T10:49:11.0910787Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.4347s] [ 3%] 2025-12-04T10:49:11.0910789Z 2025-12-04T10:49:11.0910839Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0910990Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0911035Z Traceback (most recent call last): 2025-12-04T10:49:11.0911192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0911231Z method(*args, **kwargs) 2025-12-04T10:49:11.0911384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0911423Z method(*args, **kwargs) 2025-12-04T10:49:11.0911575Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0911612Z with policy(): 2025-12-04T10:49:11.0911765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0911806Z raise RuntimeError(msg) 2025-12-04T10:49:11.0912229Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0912232Z 2025-12-04T10:49:11.0912305Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0912625Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0912627Z 2025-12-04T10:49:11.0912713Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0912785Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0912841Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0913016Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0913090Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0913126Z graph_break [] 2025-12-04T10:49:11.0913272Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0913316Z Traceback (most recent call last): 
2025-12-04T10:49:11.0913469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0913509Z method(*args, **kwargs) 2025-12-04T10:49:11.0913657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0913716Z method(*args, **kwargs) 2025-12-04T10:49:11.0913865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0913917Z with policy(): 2025-12-04T10:49:11.0914068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0914107Z raise RuntimeError(msg) 2025-12-04T10:49:11.0914514Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0914517Z 2025-12-04T10:49:11.0914588Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0914872Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0914876Z 2025-12-04T10:49:11.0914961Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0915033Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0915087Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0915263Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0915333Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0915369Z graph_break [] 2025-12-04T10:49:11.0915439Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0915493Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0915563Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0915736Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0915772Z graph_break [] 2025-12-04T10:49:11.0915824Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0915974Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0916020Z Traceback (most recent call last): 2025-12-04T10:49:11.0916194Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0916235Z method(*args, **kwargs) 2025-12-04T10:49:11.0916385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0916425Z method(*args, **kwargs) 2025-12-04T10:49:11.0916573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0916610Z with policy(): 2025-12-04T10:49:11.0916762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0916802Z raise RuntimeError(msg) 2025-12-04T10:49:11.0917211Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0917213Z 2025-12-04T10:49:11.0917283Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0917577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0917590Z 2025-12-04T10:49:11.0917674Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0917746Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0917799Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0917974Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0918044Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0918080Z graph_break [] 2025-12-04T10:49:11.0918150Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0918205Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0918274Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0918446Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0918483Z graph_break [] 2025-12-04T10:49:11.0918552Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0918607Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0918676Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0918850Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0918885Z graph_break [] 2025-12-04T10:49:11.0919126Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fd829784cb5c9d9a.xml - 2025-12-04T10:49:11.0919185Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0919836Z FAILED [0.4347s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0919839Z 2025-12-04T10:49:11.0919911Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0920198Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0920201Z 2025-12-04T10:49:11.0920286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0920348Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0920414Z ================== 1 failed, 28 deselected, 2 rerun in 3.57s =================== 2025-12-04T10:49:11.0920449Z Got exit code 1 2025-12-04T10:49:11.0920489Z Retrying single test... 
2025-12-04T10:49:11.0920686Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8b412928b2a76780.xml 2025-12-04T10:49:11.0920743Z ============================= test session starts ============================== 2025-12-04T10:49:11.0920853Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0920905Z cachedir: .pytest_cache 2025-12-04T10:49:11.0921063Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0921109Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0921158Z configfile: pytest.ini 2025-12-04T10:49:11.0921319Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0921391Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0921676Z stepcurrent: skipping 28 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0921720Z Running 1 items in this shard 2025-12-04T10:49:11.0921723Z 2025-12-04T10:49:11.0922129Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:27.615861567 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0922133Z 2025-12-04T10:49:11.0922284Z [W1204 10:25:34.253038365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0922286Z 2025-12-04T10:49:11.0922435Z [W1204 10:25:34.253194942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0922437Z 2025-12-04T10:49:11.0922587Z [W1204 10:25:34.256774305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0922589Z 2025-12-04T10:49:11.0922736Z [W1204 10:25:34.257075860 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0922740Z 2025-12-04T10:49:11.0922888Z [W1204 10:25:34.257154718 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0922890Z 2025-12-04T10:49:11.0923038Z [W1204 10:25:34.259699251 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0923040Z 2025-12-04T10:49:11.0923185Z [W1204 10:25:34.259963066 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0923187Z 2025-12-04T10:49:11.0923379Z [W1204 10:25:34.260044055 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0923381Z 2025-12-04T10:49:11.0923432Z ('RERUN', {'yellow': True}) [10.3257s] [100%] 2025-12-04T10:49:11.0923792Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:35.422158770 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0923796Z 2025-12-04T10:49:11.0923944Z [W1204 10:25:35.422609082 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0923946Z 2025-12-04T10:49:11.0924092Z [W1204 10:25:35.422709320 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924094Z 2025-12-04T10:49:11.0924243Z [W1204 10:25:35.424105374 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924245Z 2025-12-04T10:49:11.0924390Z [W1204 10:25:35.424437038 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924406Z 2025-12-04T10:49:11.0924554Z [W1204 10:25:35.424515447 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924556Z 2025-12-04T10:49:11.0924716Z [W1204 10:25:35.426830234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924718Z 2025-12-04T10:49:11.0924864Z [W1204 10:25:35.427097999 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0924866Z 2025-12-04T10:49:11.0925015Z [W1204 10:25:35.427175927 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0925016Z 2025-12-04T10:49:11.0925065Z ('RERUN', {'yellow': True}) [0.6578s] [100%] 2025-12-04T10:49:11.0925420Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:36.080955709 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0925423Z 2025-12-04T10:49:11.0925571Z [W1204 10:25:36.081363761 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0925574Z 2025-12-04T10:49:11.0925719Z [W1204 10:25:36.081448219 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0925721Z 2025-12-04T10:49:11.0925869Z [W1204 10:25:36.082846224 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0925871Z 2025-12-04T10:49:11.0926018Z [W1204 10:25:36.083176097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0926019Z 2025-12-04T10:49:11.0926169Z [W1204 10:25:36.083255366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0926171Z 2025-12-04T10:49:11.0926319Z [W1204 10:25:36.085585023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0926322Z 2025-12-04T10:49:11.0926468Z [W1204 10:25:36.085845588 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0926469Z 2025-12-04T10:49:11.0926639Z [W1204 10:25:36.085922157 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0926641Z 2025-12-04T10:49:11.0926679Z FAILED [0.6578s] [100%] 2025-12-04T10:49:11.0926680Z 2025-12-04T10:49:11.0926733Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0926882Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0926929Z Traceback (most recent call last): 2025-12-04T10:49:11.0927085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0927127Z method(*args, **kwargs) 2025-12-04T10:49:11.0927277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0927316Z method(*args, **kwargs) 2025-12-04T10:49:11.0927467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0927504Z with policy(): 2025-12-04T10:49:11.0927655Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0927696Z raise RuntimeError(msg) 2025-12-04T10:49:11.0928102Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0928117Z 2025-12-04T10:49:11.0928190Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0928477Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0928482Z 2025-12-04T10:49:11.0928566Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0928637Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0928692Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0928869Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0928941Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0928978Z graph_break [] 2025-12-04T10:49:11.0929048Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0929392Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0929437Z if out == self.unknown_value: 2025-12-04T10:49:11.0929585Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0929630Z Traceback (most recent call last): 2025-12-04T10:49:11.0929783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0929822Z method(*args, **kwargs) 2025-12-04T10:49:11.0929972Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0930012Z method(*args, **kwargs) 2025-12-04T10:49:11.0930161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0930197Z with policy(): 2025-12-04T10:49:11.0930369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0930410Z raise RuntimeError(msg) 2025-12-04T10:49:11.0930814Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0930817Z 2025-12-04T10:49:11.0930891Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0931180Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0931182Z 2025-12-04T10:49:11.0931267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0931339Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0931394Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0931568Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0931649Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0931686Z graph_break [] 2025-12-04T10:49:11.0931771Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0932148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0932190Z if out == self.unknown_value: 2025-12-04T10:49:11.0932262Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0932316Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0932388Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0932560Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0932598Z graph_break [] 2025-12-04T10:49:11.0932648Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0932799Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0932844Z Traceback (most recent call last): 2025-12-04T10:49:11.0932995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0933036Z method(*args, **kwargs) 2025-12-04T10:49:11.0933189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0933228Z method(*args, **kwargs) 2025-12-04T10:49:11.0933378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0933415Z with policy(): 2025-12-04T10:49:11.0933568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0933610Z raise RuntimeError(msg) 2025-12-04T10:49:11.0934013Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0934050Z 2025-12-04T10:49:11.0934123Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0934408Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0934412Z 2025-12-04T10:49:11.0934496Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0934568Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0934622Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0934796Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0934942Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0934981Z graph_break [] 2025-12-04T10:49:11.0935051Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0935390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0935448Z if out == self.unknown_value: 2025-12-04T10:49:11.0935519Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0935599Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0935670Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0935844Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0935881Z graph_break [] 2025-12-04T10:49:11.0935952Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0936006Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0936075Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0936247Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0936284Z graph_break [] 2025-12-04T10:49:11.0936524Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8b412928b2a76780.xml - 2025-12-04T10:49:11.0936586Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0937217Z FAILED [0.6578s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0937221Z 2025-12-04T10:49:11.0937293Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0937576Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0937581Z 2025-12-04T10:49:11.0937666Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0937728Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0937815Z ================== 1 failed, 57 deselected, 2 rerun in 11.79s ================== 2025-12-04T10:49:11.0937852Z Got exit code 1 2025-12-04T10:49:11.0937891Z Retrying single test... 2025-12-04T10:49:11.0938088Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7f5a34953cd8dfd7.xml 2025-12-04T10:49:11.0938145Z ============================= test session starts ============================== 2025-12-04T10:49:11.0938257Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0938298Z cachedir: .pytest_cache 2025-12-04T10:49:11.0938456Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0938501Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0938541Z configfile: pytest.ini 2025-12-04T10:49:11.0938705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0938778Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0939063Z stepcurrent: skipping 28 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0939120Z Running 1 items in this shard 2025-12-04T10:49:11.0939122Z 2025-12-04T10:49:11.0939480Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:45.112645936 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0939495Z 2025-12-04T10:49:11.0939646Z [W1204 10:25:52.503906627 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0939649Z 2025-12-04T10:49:11.0939799Z [W1204 10:25:52.504084873 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0939801Z 2025-12-04T10:49:11.0939950Z [W1204 10:25:52.508289545 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0939952Z 2025-12-04T10:49:11.0940100Z [W1204 10:25:52.508598030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0940103Z 2025-12-04T10:49:11.0940249Z [W1204 10:25:52.508675098 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0940252Z 2025-12-04T10:49:11.0940398Z [W1204 10:25:52.511211871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0940400Z 2025-12-04T10:49:11.0940549Z [W1204 10:25:52.511468577 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0940551Z 2025-12-04T10:49:11.0940697Z [W1204 10:25:52.511546425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0940700Z 2025-12-04T10:49:11.0940750Z ('RERUN', {'yellow': True}) [10.0781s] [100%] 2025-12-04T10:49:11.0941103Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:54.682386976 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941106Z 2025-12-04T10:49:11.0941253Z [W1204 10:25:54.682788018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941275Z 2025-12-04T10:49:11.0941422Z [W1204 10:25:54.682868037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941424Z 2025-12-04T10:49:11.0941570Z [W1204 10:25:54.684257811 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941573Z 2025-12-04T10:49:11.0941720Z [W1204 10:25:54.684576155 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941723Z 2025-12-04T10:49:11.0941912Z [W1204 10:25:54.684652554 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0941914Z 2025-12-04T10:49:11.0942064Z [W1204 10:25:54.686958881 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0942065Z 2025-12-04T10:49:11.0942216Z [W1204 10:25:54.687238236 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0942218Z 2025-12-04T10:49:11.0942365Z [W1204 10:25:54.687316535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0942386Z 2025-12-04T10:49:11.0942436Z ('RERUN', {'yellow': True}) [0.6614s] [100%] 2025-12-04T10:49:11.0942787Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:25:54.335089316 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0942803Z 2025-12-04T10:49:11.0942952Z [W1204 10:25:54.335477179 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0942954Z 2025-12-04T10:49:11.0943102Z [W1204 10:25:54.335558268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0943105Z 2025-12-04T10:49:11.0943250Z [W1204 10:25:54.336933182 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0943253Z 2025-12-04T10:49:11.0943402Z [W1204 10:25:54.337256986 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0943405Z 2025-12-04T10:49:11.0943551Z [W1204 10:25:54.337335515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0943554Z 2025-12-04T10:49:11.0943702Z [W1204 10:25:54.339635852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0943704Z 2025-12-04T10:49:11.0943852Z [W1204 10:25:54.339887888 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0943855Z 2025-12-04T10:49:11.0944004Z [W1204 10:25:54.339961336 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0944007Z 2025-12-04T10:49:11.0944046Z FAILED [0.6498s] [100%] 2025-12-04T10:49:11.0944047Z 2025-12-04T10:49:11.0944099Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0944250Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0944295Z Traceback (most recent call last): 2025-12-04T10:49:11.0944452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0944493Z method(*args, **kwargs) 2025-12-04T10:49:11.0944671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0944710Z method(*args, **kwargs) 2025-12-04T10:49:11.0944860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0944897Z with policy(): 2025-12-04T10:49:11.0945048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0945088Z raise RuntimeError(msg) 2025-12-04T10:49:11.0945484Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0945489Z 2025-12-04T10:49:11.0945563Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0945850Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0945871Z 2025-12-04T10:49:11.0945958Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0946029Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0946098Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0946272Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0946344Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0946380Z graph_break [] 2025-12-04T10:49:11.0946452Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0946793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0946839Z if out == self.unknown_value: 2025-12-04T10:49:11.0946986Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0947032Z Traceback (most recent call last): 2025-12-04T10:49:11.0947183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0947225Z method(*args, **kwargs) 2025-12-04T10:49:11.0947375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0947415Z method(*args, **kwargs) 2025-12-04T10:49:11.0947565Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0947600Z with policy(): 2025-12-04T10:49:11.0947751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0947791Z raise RuntimeError(msg) 2025-12-04T10:49:11.0948194Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0948198Z 2025-12-04T10:49:11.0948270Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0948580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0948582Z 2025-12-04T10:49:11.0948667Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0948740Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0948796Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0948969Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0949043Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0949080Z graph_break [] 2025-12-04T10:49:11.0949151Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0949494Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0949538Z if out == self.unknown_value: 2025-12-04T10:49:11.0949607Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0949677Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0949747Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0949934Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0949970Z graph_break [] 2025-12-04T10:49:11.0950021Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0950170Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.0950215Z Traceback (most recent call last): 2025-12-04T10:49:11.0950367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0950407Z method(*args, **kwargs) 2025-12-04T10:49:11.0950557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0950597Z method(*args, **kwargs) 2025-12-04T10:49:11.0950747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0950785Z with policy(): 2025-12-04T10:49:11.0950935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0950976Z raise RuntimeError(msg) 2025-12-04T10:49:11.0951385Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0951388Z 2025-12-04T10:49:11.0951460Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0951748Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0951751Z 2025-12-04T10:49:11.0951835Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0951948Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0952002Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0952212Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0952283Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0952320Z graph_break [] 2025-12-04T10:49:11.0952392Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0952732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0952776Z if out == self.unknown_value: 2025-12-04T10:49:11.0952846Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0952901Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0952972Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0953146Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0953181Z graph_break [] 2025-12-04T10:49:11.0953253Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0953320Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0953390Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0953577Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0953614Z graph_break [] 2025-12-04T10:49:11.0953854Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7f5a34953cd8dfd7.xml - 2025-12-04T10:49:11.0953915Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0954548Z FAILED [0.6498s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0954552Z 2025-12-04T10:49:11.0954623Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0954908Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0954912Z 2025-12-04T10:49:11.0954995Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0955058Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0955122Z ================== 1 failed, 57 deselected, 2 rerun in 11.54s ================== 2025-12-04T10:49:11.0955160Z Got exit code 1 2025-12-04T10:49:11.0955397Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.0955526Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.0955722Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9bec0ec888252497.xml 2025-12-04T10:49:11.0955798Z ============================= test session starts ============================== 2025-12-04T10:49:11.0955910Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0955950Z cachedir: .pytest_cache 2025-12-04T10:49:11.0956108Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0956155Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0956194Z configfile: pytest.ini 2025-12-04T10:49:11.0956353Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0956429Z collecting ... collected 58 items / 29 deselected / 29 selected 2025-12-04T10:49:11.0956480Z stepcurrent: skipping 29 already run items. 2025-12-04T10:49:11.0956525Z Running 29 items in this shard 2025-12-04T10:49:11.0956527Z 2025-12-04T10:49:11.0956772Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6348s] [ 3%] 2025-12-04T10:49:11.0957014Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5564s] [ 3%] 2025-12-04T10:49:11.0957245Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.5534s] [ 3%] 2025-12-04T10:49:11.0957260Z 2025-12-04T10:49:11.0957310Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0957457Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0957501Z Traceback (most recent call last): 2025-12-04T10:49:11.0957662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0957701Z method(*args, **kwargs) 2025-12-04T10:49:11.0957853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0957893Z method(*args, **kwargs) 2025-12-04T10:49:11.0958043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0958079Z with policy(): 2025-12-04T10:49:11.0958230Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0958271Z raise RuntimeError(msg) 2025-12-04T10:49:11.0958665Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0958667Z 2025-12-04T10:49:11.0958739Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0959026Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0959029Z 2025-12-04T10:49:11.0959114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0959187Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0959242Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0959419Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0959511Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0959547Z graph_break [] 2025-12-04T10:49:11.0959694Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0959738Z Traceback (most recent call last): 2025-12-04T10:49:11.0959892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0959931Z method(*args, **kwargs) 
2025-12-04T10:49:11.0960080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0960120Z method(*args, **kwargs) 2025-12-04T10:49:11.0960269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0960305Z with policy(): 2025-12-04T10:49:11.0960458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0960497Z raise RuntimeError(msg) 2025-12-04T10:49:11.0960892Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.0960907Z 2025-12-04T10:49:11.0960978Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0961272Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0961275Z 2025-12-04T10:49:11.0961360Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0961434Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0961489Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0961663Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0961735Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0961771Z 
graph_break [] 2025-12-04T10:49:11.0961842Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0961933Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0962003Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0962175Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0962211Z graph_break [] 2025-12-04T10:49:11.0962263Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0962410Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0962455Z Traceback (most recent call last): 2025-12-04T10:49:11.0962610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0962650Z method(*args, **kwargs) 2025-12-04T10:49:11.0962801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0962841Z method(*args, **kwargs) 2025-12-04T10:49:11.0962989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0963025Z with policy(): 2025-12-04T10:49:11.0963212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0963252Z raise RuntimeError(msg) 2025-12-04T10:49:11.0963647Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. 
CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0963651Z 2025-12-04T10:49:11.0963723Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0964007Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0964009Z 2025-12-04T10:49:11.0964094Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0964166Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0964220Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0964393Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0964477Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0964513Z graph_break [] 2025-12-04T10:49:11.0964600Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0964653Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0964722Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0964895Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0964932Z graph_break [] 2025-12-04T10:49:11.0965003Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0965055Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0965125Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0965296Z inductor [('pattern_matcher_nodes', 
8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0965333Z graph_break [] 2025-12-04T10:49:11.0965573Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9bec0ec888252497.xml - 2025-12-04T10:49:11.0965633Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0966261Z FAILED [0.5534s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0966266Z 2025-12-04T10:49:11.0966337Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0966621Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0966625Z 2025-12-04T10:49:11.0966707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0966787Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0966852Z ================== 1 failed, 29 deselected, 2 rerun in 3.89s =================== 2025-12-04T10:49:11.0966890Z Got exit code 1 2025-12-04T10:49:11.0966929Z Retrying single test... 
2025-12-04T10:49:11.0967126Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2dbdafa4d9179798.xml 2025-12-04T10:49:11.0967182Z ============================= test session starts ============================== 2025-12-04T10:49:11.0967293Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0967336Z cachedir: .pytest_cache 2025-12-04T10:49:11.0967493Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0967538Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0967577Z configfile: pytest.ini 2025-12-04T10:49:11.0967740Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0967811Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0968096Z stepcurrent: skipping 29 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0968150Z Running 1 items in this shard 2025-12-04T10:49:11.0968152Z 2025-12-04T10:49:11.0968519Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:15.696010203 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0968522Z 2025-12-04T10:49:11.0968675Z [W1204 10:26:22.369953965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0968677Z 2025-12-04T10:49:11.0968826Z [W1204 10:26:22.370135972 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0968828Z 2025-12-04T10:49:11.0968977Z [W1204 10:26:22.373556278 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0968980Z 2025-12-04T10:49:11.0969127Z [W1204 10:26:22.373839953 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0969130Z 2025-12-04T10:49:11.0969280Z [W1204 10:26:22.373922362 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0969282Z 2025-12-04T10:49:11.0969430Z [W1204 10:26:22.376258889 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0969432Z 2025-12-04T10:49:11.0969581Z [W1204 10:26:22.376517694 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0969583Z 2025-12-04T10:49:11.0969731Z [W1204 10:26:22.376593942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0969733Z 2025-12-04T10:49:11.0969783Z ('RERUN', {'yellow': True}) [10.2058s] [100%] 2025-12-04T10:49:11.0970142Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:23.313900106 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0970145Z 2025-12-04T10:49:11.0970312Z [W1204 10:26:23.314285479 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0970314Z 2025-12-04T10:49:11.0970462Z [W1204 10:26:23.314389757 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0970463Z 2025-12-04T10:49:11.0970610Z [W1204 10:26:23.315770451 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0970614Z 2025-12-04T10:49:11.0970760Z [W1204 10:26:23.316112335 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0970763Z 2025-12-04T10:49:11.0970910Z [W1204 10:26:23.316193434 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0970913Z 2025-12-04T10:49:11.0971060Z [W1204 10:26:23.318358414 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0971063Z 2025-12-04T10:49:11.0971212Z [W1204 10:26:23.318613099 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0971214Z 2025-12-04T10:49:11.0971362Z [W1204 10:26:23.318688738 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0971378Z 2025-12-04T10:49:11.0971426Z ('RERUN', {'yellow': True}) [0.4512s] [100%] 2025-12-04T10:49:11.0971775Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:24.758968529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0971788Z 2025-12-04T10:49:11.0971970Z [W1204 10:26:24.759357262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0971973Z 2025-12-04T10:49:11.0972123Z [W1204 10:26:24.759453370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0972125Z 2025-12-04T10:49:11.0972272Z [W1204 10:26:24.760828855 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0972275Z 2025-12-04T10:49:11.0972423Z [W1204 10:26:24.761171178 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0972426Z 2025-12-04T10:49:11.0972574Z [W1204 10:26:24.761254057 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0972576Z 2025-12-04T10:49:11.0972721Z [W1204 10:26:24.763442026 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0972723Z 2025-12-04T10:49:11.0972872Z [W1204 10:26:24.763703472 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0972874Z 2025-12-04T10:49:11.0973020Z [W1204 10:26:24.763780230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0973023Z 2025-12-04T10:49:11.0973061Z FAILED [0.4399s] [100%] 2025-12-04T10:49:11.0973063Z 2025-12-04T10:49:11.0973114Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0973262Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0973308Z Traceback (most recent call last): 2025-12-04T10:49:11.0973465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0973507Z method(*args, **kwargs) 2025-12-04T10:49:11.0973686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0973726Z method(*args, **kwargs) 2025-12-04T10:49:11.0973876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0973914Z with policy(): 2025-12-04T10:49:11.0974067Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0974109Z raise RuntimeError(msg) 2025-12-04T10:49:11.0974504Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.0974507Z 2025-12-04T10:49:11.0974582Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0974870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0974887Z 2025-12-04T10:49:11.0974973Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0975045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0975115Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0975291Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0975362Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0975399Z graph_break [] 2025-12-04T10:49:11.0975471Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0975815Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0975860Z if out == self.unknown_value: 2025-12-04T10:49:11.0976007Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0976052Z Traceback (most recent call last): 2025-12-04T10:49:11.0976205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0976245Z method(*args, **kwargs) 2025-12-04T10:49:11.0976395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0976434Z method(*args, **kwargs) 2025-12-04T10:49:11.0976586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0976621Z with policy(): 2025-12-04T10:49:11.0976775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0976816Z raise RuntimeError(msg) 2025-12-04T10:49:11.0977214Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0977217Z 2025-12-04T10:49:11.0977289Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0977596Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0977599Z 2025-12-04T10:49:11.0977686Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0977757Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0977812Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0977984Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0978060Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0978095Z graph_break [] 2025-12-04T10:49:11.0978166Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0978509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0978553Z if out == self.unknown_value: 2025-12-04T10:49:11.0978623Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0978688Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0978759Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0978945Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0978981Z graph_break [] 2025-12-04T10:49:11.0979031Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0979181Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0979225Z Traceback (most recent call last): 2025-12-04T10:49:11.0979378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0979416Z method(*args, **kwargs) 2025-12-04T10:49:11.0979567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0979606Z method(*args, **kwargs) 2025-12-04T10:49:11.0979757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0979793Z with policy(): 2025-12-04T10:49:11.0979943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0979983Z raise RuntimeError(msg) 2025-12-04T10:49:11.0980384Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0980387Z 2025-12-04T10:49:11.0980460Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0980742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0980745Z 2025-12-04T10:49:11.0980831Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0980902Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0980957Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0981148Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0981219Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0981254Z graph_break [] 2025-12-04T10:49:11.0981327Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0981665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0981710Z if out == self.unknown_value: 2025-12-04T10:49:11.0981780Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0981834Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0981940Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0982116Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0982153Z graph_break [] 2025-12-04T10:49:11.0982222Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0982293Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0982362Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0982549Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0982585Z graph_break [] 2025-12-04T10:49:11.0982826Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2dbdafa4d9179798.xml - 2025-12-04T10:49:11.0982885Z =========================== short test summary info ============================ 2025-12-04T10:49:11.0983509Z FAILED [0.4399s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.0983516Z 2025-12-04T10:49:11.0983586Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0983868Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0983872Z 2025-12-04T10:49:11.0983956Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0984017Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.0984083Z ================== 1 failed, 57 deselected, 2 rerun in 11.27s ================== 2025-12-04T10:49:11.0984120Z Got exit code 1 2025-12-04T10:49:11.0984160Z Retrying single test... 2025-12-04T10:49:11.0984356Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d391ccf0385c5998.xml 2025-12-04T10:49:11.0984414Z ============================= test session starts ============================== 2025-12-04T10:49:11.0984524Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.0984564Z cachedir: .pytest_cache 2025-12-04T10:49:11.0984745Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.0984791Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.0984831Z configfile: pytest.ini 2025-12-04T10:49:11.0984992Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.0985067Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.0985352Z stepcurrent: skipping 29 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0985397Z Running 1 items in this shard 2025-12-04T10:49:11.0985399Z 2025-12-04T10:49:11.0985756Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:33.723578562 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0985759Z 2025-12-04T10:49:11.0985911Z [W1204 10:26:40.121485074 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0985912Z 2025-12-04T10:49:11.0986073Z [W1204 10:26:40.121650540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986075Z 2025-12-04T10:49:11.0986223Z [W1204 10:26:40.125302593 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986237Z 2025-12-04T10:49:11.0986384Z [W1204 10:26:40.125594928 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986387Z 2025-12-04T10:49:11.0986534Z [W1204 10:26:40.125675376 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986537Z 2025-12-04T10:49:11.0986684Z [W1204 10:26:40.128158980 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986686Z 2025-12-04T10:49:11.0986831Z [W1204 10:26:40.128422165 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0986834Z 2025-12-04T10:49:11.0986982Z [W1204 10:26:40.128499024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0986985Z 2025-12-04T10:49:11.0987034Z ('RERUN', {'yellow': True}) [10.0435s] [100%] 2025-12-04T10:49:11.0987392Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:41.144698262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0987394Z 2025-12-04T10:49:11.0987544Z [W1204 10:26:41.145078335 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0987546Z 2025-12-04T10:49:11.0987693Z [W1204 10:26:41.145163404 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0987695Z 2025-12-04T10:49:11.0987841Z [W1204 10:26:41.146534558 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0987844Z 2025-12-04T10:49:11.0987991Z [W1204 10:26:41.146856882 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0987993Z 2025-12-04T10:49:11.0988160Z [W1204 10:26:41.146934951 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0988162Z 2025-12-04T10:49:11.0988309Z [W1204 10:26:41.149210809 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0988311Z 2025-12-04T10:49:11.0988457Z [W1204 10:26:41.149469174 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0988460Z 2025-12-04T10:49:11.0988608Z [W1204 10:26:41.149545433 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0988611Z 2025-12-04T10:49:11.0988659Z ('RERUN', {'yellow': True}) [0.5203s] [100%] 2025-12-04T10:49:11.0989011Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:26:42.662076995 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989013Z 2025-12-04T10:49:11.0989162Z [W1204 10:26:42.662469348 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989164Z 2025-12-04T10:49:11.0989310Z [W1204 10:26:42.662561756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989324Z 2025-12-04T10:49:11.0989474Z [W1204 10:26:42.663985180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989492Z 2025-12-04T10:49:11.0989640Z [W1204 10:26:42.664323604 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989642Z 2025-12-04T10:49:11.0989790Z [W1204 10:26:42.664403012 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.0989792Z 2025-12-04T10:49:11.0989938Z [W1204 10:26:42.666687610 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0989942Z 2025-12-04T10:49:11.0990088Z [W1204 10:26:42.666943535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0990090Z 2025-12-04T10:49:11.0990237Z [W1204 10:26:42.667025374 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.0990240Z 2025-12-04T10:49:11.0990278Z FAILED [0.5239s] [100%] 2025-12-04T10:49:11.0990280Z 2025-12-04T10:49:11.0990331Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.0990477Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0990525Z Traceback (most recent call last): 2025-12-04T10:49:11.0990680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0990721Z method(*args, **kwargs) 2025-12-04T10:49:11.0990872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0990913Z method(*args, **kwargs) 2025-12-04T10:49:11.0991064Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0991102Z with policy(): 2025-12-04T10:49:11.0991253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0991294Z raise RuntimeError(msg) 2025-12-04T10:49:11.0991706Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.0991709Z 2025-12-04T10:49:11.0991782Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0992113Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0992115Z 2025-12-04T10:49:11.0992201Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0992273Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0992327Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0992503Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0992574Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0992611Z graph_break [] 2025-12-04T10:49:11.0992682Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0993045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0993106Z if out == self.unknown_value: 2025-12-04T10:49:11.0994857Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0994905Z Traceback (most recent call last): 2025-12-04T10:49:11.0995068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0995108Z method(*args, **kwargs) 2025-12-04T10:49:11.0995260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0995299Z method(*args, **kwargs) 2025-12-04T10:49:11.0995451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0995488Z with policy(): 2025-12-04T10:49:11.0995640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0995684Z raise RuntimeError(msg) 2025-12-04T10:49:11.0996085Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.0996087Z 2025-12-04T10:49:11.0996160Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0996445Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0996449Z 2025-12-04T10:49:11.0996536Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0996609Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0996666Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0996841Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0996951Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0996988Z graph_break [] 2025-12-04T10:49:11.0997058Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.0997400Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.0997445Z if out == self.unknown_value: 2025-12-04T10:49:11.0997516Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0997571Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.0997643Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.0997819Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.0997857Z graph_break [] 2025-12-04T10:49:11.0997908Z =================================== FAILURES =================================== 2025-12-04T10:49:11.0998055Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.0998114Z Traceback (most recent call last): 2025-12-04T10:49:11.0998268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0998307Z method(*args, **kwargs) 2025-12-04T10:49:11.0998471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.0998509Z method(*args, **kwargs) 2025-12-04T10:49:11.0998660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.0998696Z with policy(): 2025-12-04T10:49:11.0998849Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.0998888Z raise RuntimeError(msg) 2025-12-04T10:49:11.0999288Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.0999294Z 2025-12-04T10:49:11.0999367Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.0999652Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.0999654Z 2025-12-04T10:49:11.0999744Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.0999815Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.0999869Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1000042Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1000114Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1000149Z graph_break [] 2025-12-04T10:49:11.1000222Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1000562Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1000605Z if out == self.unknown_value: 2025-12-04T10:49:11.1000695Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1000750Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1000820Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1000993Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1001031Z graph_break [] 2025-12-04T10:49:11.1001100Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1001155Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1001224Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1001397Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1001434Z graph_break [] 2025-12-04T10:49:11.1001673Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d391ccf0385c5998.xml - 2025-12-04T10:49:11.1001731Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1002414Z FAILED [0.5239s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1002434Z 2025-12-04T10:49:11.1002507Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1002791Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1002793Z 2025-12-04T10:49:11.1002878Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1002941Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1003009Z ================== 1 failed, 57 deselected, 2 rerun in 11.23s ================== 2025-12-04T10:49:11.1003047Z Got exit code 1 2025-12-04T10:49:11.1003282Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1003409Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1003607Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9ab548a3d12d0ac7.xml 2025-12-04T10:49:11.1003663Z ============================= test session starts ============================== 2025-12-04T10:49:11.1003774Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1003816Z cachedir: .pytest_cache 2025-12-04T10:49:11.1003975Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1004021Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1004062Z configfile: pytest.ini 2025-12-04T10:49:11.1004225Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1004299Z collecting ... collected 58 items / 30 deselected / 28 selected 2025-12-04T10:49:11.1004378Z stepcurrent: skipping 30 already run items. 2025-12-04T10:49:11.1004422Z Running 28 items in this shard 2025-12-04T10:49:11.1004425Z 2025-12-04T10:49:11.1004669Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8814s] [ 3%] 2025-12-04T10:49:11.1004911Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4706s] [ 3%] 2025-12-04T10:49:11.1005131Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.4426s] [ 3%] 2025-12-04T10:49:11.1005133Z 2025-12-04T10:49:11.1005183Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1005331Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1005376Z Traceback (most recent call last): 2025-12-04T10:49:11.1005533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1005572Z method(*args, **kwargs) 2025-12-04T10:49:11.1005744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1005782Z method(*args, **kwargs) 2025-12-04T10:49:11.1005943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1005980Z with policy(): 2025-12-04T10:49:11.1006131Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1006171Z raise RuntimeError(msg) 2025-12-04T10:49:11.1006567Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1006570Z 2025-12-04T10:49:11.1006642Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1006927Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1006930Z 2025-12-04T10:49:11.1007015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1007086Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1007141Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1007415Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1007487Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1007525Z graph_break [] 2025-12-04T10:49:11.1007671Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1007717Z Traceback (most recent call last): 2025-12-04T10:49:11.1007868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1007907Z method(*args, **kwargs) 2025-12-04T10:49:11.1008056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1008116Z method(*args, **kwargs) 2025-12-04T10:49:11.1008266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1008302Z with policy(): 2025-12-04T10:49:11.1008453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1008494Z raise RuntimeError(msg) 2025-12-04T10:49:11.1008896Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1008899Z 2025-12-04T10:49:11.1008971Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1009257Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1009260Z 2025-12-04T10:49:11.1009344Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1009415Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1009480Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1009749Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1009831Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1009867Z graph_break [] 2025-12-04T10:49:11.1009937Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1009992Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1010061Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1010330Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1010367Z graph_break [] 2025-12-04T10:49:11.1010419Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1010566Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1010611Z Traceback (most recent call last): 2025-12-04T10:49:11.1010764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1010804Z method(*args, **kwargs) 2025-12-04T10:49:11.1010954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1010993Z method(*args, **kwargs) 2025-12-04T10:49:11.1011144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1011181Z with policy(): 2025-12-04T10:49:11.1011331Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1011372Z raise RuntimeError(msg) 2025-12-04T10:49:11.1011769Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1011772Z 2025-12-04T10:49:11.1011915Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1012198Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1012201Z 2025-12-04T10:49:11.1012285Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1012356Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1012410Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1012680Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1012753Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1012789Z graph_break [] 2025-12-04T10:49:11.1012860Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1012913Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1012983Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1013263Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1013315Z graph_break [] 2025-12-04T10:49:11.1013384Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1013437Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1013505Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1013771Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1013807Z graph_break [] 2025-12-04T10:49:11.1014050Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9ab548a3d12d0ac7.xml - 2025-12-04T10:49:11.1014110Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1014733Z FAILED [0.4426s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1014737Z 2025-12-04T10:49:11.1014808Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1015090Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1015093Z 2025-12-04T10:49:11.1015178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1015240Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1015305Z ================== 1 failed, 30 deselected, 2 rerun in 3.96s =================== 2025-12-04T10:49:11.1015341Z Got exit code 1 2025-12-04T10:49:11.1015381Z Retrying single test... 2025-12-04T10:49:11.1015598Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8ba4b164ab403bf7.xml 2025-12-04T10:49:11.1015655Z ============================= test session starts ============================== 2025-12-04T10:49:11.1015766Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1015808Z cachedir: .pytest_cache 2025-12-04T10:49:11.1015966Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1016011Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1016052Z configfile: pytest.ini 2025-12-04T10:49:11.1016215Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1016288Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1016571Z stepcurrent: skipping 30 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1016615Z Running 1 items in this shard 2025-12-04T10:49:11.1016617Z 2025-12-04T10:49:11.1016973Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:01.518841996 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1016998Z 2025-12-04T10:49:11.1017150Z [W1204 10:27:08.400827525 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017152Z 2025-12-04T10:49:11.1017302Z [W1204 10:27:08.400989512 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017304Z 2025-12-04T10:49:11.1017453Z [W1204 10:27:08.404919349 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017455Z 2025-12-04T10:49:11.1017603Z [W1204 10:27:08.405219534 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017605Z 2025-12-04T10:49:11.1017752Z [W1204 10:27:08.405297592 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017755Z 2025-12-04T10:49:11.1017903Z [W1204 10:27:08.407896335 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1017905Z 2025-12-04T10:49:11.1018050Z [W1204 10:27:08.408176619 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1018053Z 2025-12-04T10:49:11.1018200Z [W1204 10:27:08.408254958 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1018202Z 2025-12-04T10:49:11.1018252Z ('RERUN', {'yellow': True}) [9.8819s] [100%] 2025-12-04T10:49:11.1018604Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:09.164122440 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1018607Z 2025-12-04T10:49:11.1018755Z [W1204 10:27:09.164513413 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1018757Z 2025-12-04T10:49:11.1018903Z [W1204 10:27:09.164605121 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1018907Z 2025-12-04T10:49:11.1019077Z [W1204 10:27:09.166005055 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1019079Z 2025-12-04T10:49:11.1019226Z [W1204 10:27:09.166262190 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1019229Z 2025-12-04T10:49:11.1019378Z [W1204 10:27:09.166336669 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1019380Z 2025-12-04T10:49:11.1019529Z [W1204 10:27:09.168607947 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1019532Z 2025-12-04T10:49:11.1019678Z [W1204 10:27:09.168859952 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1019680Z 2025-12-04T10:49:11.1019827Z [W1204 10:27:09.168933851 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1019829Z 2025-12-04T10:49:11.1019878Z ('RERUN', {'yellow': True}) [0.6212s] [100%] 2025-12-04T10:49:11.1020227Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:10.790446753 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020239Z 2025-12-04T10:49:11.1020397Z [W1204 10:27:10.790855365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020398Z 2025-12-04T10:49:11.1020545Z [W1204 10:27:10.790954903 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020546Z 2025-12-04T10:49:11.1020695Z [W1204 10:27:10.792379667 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020697Z 2025-12-04T10:49:11.1020845Z [W1204 10:27:10.792659342 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020846Z 2025-12-04T10:49:11.1020993Z [W1204 10:27:10.792740120 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1020995Z 2025-12-04T10:49:11.1021142Z [W1204 10:27:10.795053268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1021146Z 2025-12-04T10:49:11.1021293Z [W1204 10:27:10.795317063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1021294Z 2025-12-04T10:49:11.1021444Z [W1204 10:27:10.795394961 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1021446Z 2025-12-04T10:49:11.1021483Z FAILED [0.6262s] [100%] 2025-12-04T10:49:11.1021486Z 2025-12-04T10:49:11.1021538Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1021686Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1021732Z Traceback (most recent call last): 2025-12-04T10:49:11.1021928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1021969Z method(*args, **kwargs) 2025-12-04T10:49:11.1022121Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1022160Z method(*args, **kwargs) 2025-12-04T10:49:11.1022339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1022375Z with policy(): 2025-12-04T10:49:11.1022527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1022566Z raise RuntimeError(msg) 2025-12-04T10:49:11.1022960Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1022963Z 2025-12-04T10:49:11.1023036Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1023323Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1023326Z 2025-12-04T10:49:11.1023411Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1023482Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1023537Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1023827Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1023915Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1023951Z graph_break [] 2025-12-04T10:49:11.1024022Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1024365Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1024409Z if out == self.unknown_value: 2025-12-04T10:49:11.1024555Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1024602Z Traceback (most recent call last): 2025-12-04T10:49:11.1024753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1024794Z method(*args, **kwargs) 2025-12-04T10:49:11.1024943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1024982Z method(*args, **kwargs) 2025-12-04T10:49:11.1025131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1025168Z with policy(): 2025-12-04T10:49:11.1025321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1025361Z raise RuntimeError(msg) 2025-12-04T10:49:11.1025757Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1025761Z 2025-12-04T10:49:11.1025832Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1026117Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1026119Z 2025-12-04T10:49:11.1026224Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1026296Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1026350Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1026619Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1026692Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1026728Z graph_break [] 2025-12-04T10:49:11.1026798Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1027139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1027182Z if out == self.unknown_value: 2025-12-04T10:49:11.1027252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1027307Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1027387Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1027654Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1027700Z graph_break [] 2025-12-04T10:49:11.1027752Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1027898Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1027945Z Traceback (most recent call last): 2025-12-04T10:49:11.1028100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1028139Z method(*args, **kwargs) 2025-12-04T10:49:11.1028290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1028331Z method(*args, **kwargs) 2025-12-04T10:49:11.1028479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1028517Z with policy(): 2025-12-04T10:49:11.1028668Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1028708Z raise RuntimeError(msg) 2025-12-04T10:49:11.1029108Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1029111Z 2025-12-04T10:49:11.1029182Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1029467Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1029470Z 2025-12-04T10:49:11.1029555Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1029625Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1029679Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1029967Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1030038Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1030075Z graph_break [] 2025-12-04T10:49:11.1030146Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1030489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1030534Z if out == self.unknown_value: 2025-12-04T10:49:11.1030604Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1030658Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1030728Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1030996Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1031042Z graph_break [] 2025-12-04T10:49:11.1031112Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1031165Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1031245Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1031509Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1031546Z graph_break [] 2025-12-04T10:49:11.1031791Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8ba4b164ab403bf7.xml - 2025-12-04T10:49:11.1031886Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1032510Z FAILED [0.6262s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1032514Z 2025-12-04T10:49:11.1032585Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1032872Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1032874Z 2025-12-04T10:49:11.1032957Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1033019Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1033086Z ================== 1 failed, 57 deselected, 2 rerun in 11.30s ================== 2025-12-04T10:49:11.1033122Z Got exit code 1 2025-12-04T10:49:11.1033162Z Retrying single test... 2025-12-04T10:49:11.1033360Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-082aaad3fa51c46e.xml 2025-12-04T10:49:11.1033416Z ============================= test session starts ============================== 2025-12-04T10:49:11.1033527Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1033602Z cachedir: .pytest_cache 2025-12-04T10:49:11.1033759Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1033805Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1033844Z configfile: pytest.ini 2025-12-04T10:49:11.1034007Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1034079Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1034363Z stepcurrent: skipping 30 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1034405Z Running 1 items in this shard 2025-12-04T10:49:11.1034408Z 2025-12-04T10:49:11.1034770Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:20.012087324 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1034772Z 2025-12-04T10:49:11.1034924Z [W1204 10:27:27.715286214 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1034941Z 2025-12-04T10:49:11.1035089Z [W1204 10:27:27.715468371 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1035103Z 2025-12-04T10:49:11.1035252Z [W1204 10:27:27.719407268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1035254Z 2025-12-04T10:49:11.1035401Z [W1204 10:27:27.719707393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1035404Z 2025-12-04T10:49:11.1035552Z [W1204 10:27:27.719783972 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1035554Z 2025-12-04T10:49:11.1035700Z [W1204 10:27:27.722229326 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1035703Z 2025-12-04T10:49:11.1035850Z [W1204 10:27:27.722497401 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1035853Z 2025-12-04T10:49:11.1036000Z [W1204 10:27:27.722573290 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1036002Z 2025-12-04T10:49:11.1036050Z ('RERUN', {'yellow': True}) [9.6932s] [100%] 2025-12-04T10:49:11.1036403Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:27.412318020 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1036405Z 2025-12-04T10:49:11.1036553Z [W1204 10:27:27.412664794 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1036556Z 2025-12-04T10:49:11.1036703Z [W1204 10:27:27.412743773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1036706Z 2025-12-04T10:49:11.1036855Z [W1204 10:27:27.414111767 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1036857Z 2025-12-04T10:49:11.1037003Z [W1204 10:27:27.414358253 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1037005Z 2025-12-04T10:49:11.1037172Z [W1204 10:27:27.414431241 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1037174Z 2025-12-04T10:49:11.1037322Z [W1204 10:27:27.416604861 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1037325Z 2025-12-04T10:49:11.1037471Z [W1204 10:27:27.416859187 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1037474Z 2025-12-04T10:49:11.1037621Z [W1204 10:27:27.416933365 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1037623Z 2025-12-04T10:49:11.1037669Z ('RERUN', {'yellow': True}) [0.5533s] [100%] 2025-12-04T10:49:11.1038023Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:27:28.968069142 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038025Z 2025-12-04T10:49:11.1038171Z [W1204 10:27:28.968415056 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038184Z 2025-12-04T10:49:11.1038330Z [W1204 10:27:28.968493624 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038342Z 2025-12-04T10:49:11.1038490Z [W1204 10:27:28.969833390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038492Z 2025-12-04T10:49:11.1038638Z [W1204 10:27:28.970084785 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1038640Z 2025-12-04T10:49:11.1038791Z [W1204 10:27:28.970161574 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038793Z 2025-12-04T10:49:11.1038942Z [W1204 10:27:28.972324044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1038945Z 2025-12-04T10:49:11.1039092Z [W1204 10:27:28.972579269 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1039094Z 2025-12-04T10:49:11.1039242Z [W1204 10:27:28.972656338 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1039244Z 2025-12-04T10:49:11.1039281Z FAILED [0.5553s] [100%] 2025-12-04T10:49:11.1039283Z 2025-12-04T10:49:11.1039335Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1039484Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1039530Z Traceback (most recent call last): 2025-12-04T10:49:11.1039687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1039728Z method(*args, **kwargs) 2025-12-04T10:49:11.1039879Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1039921Z method(*args, **kwargs) 2025-12-04T10:49:11.1040071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1040109Z with policy(): 2025-12-04T10:49:11.1040259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1040300Z raise RuntimeError(msg) 2025-12-04T10:49:11.1040712Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1040715Z 2025-12-04T10:49:11.1040788Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1041074Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1041077Z 2025-12-04T10:49:11.1041162Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1041234Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1041288Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1041561Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1041633Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1041680Z graph_break [] 2025-12-04T10:49:11.1041751Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1042137Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1042181Z if out == self.unknown_value: 2025-12-04T10:49:11.1042326Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1042372Z Traceback (most recent call last): 2025-12-04T10:49:11.1042523Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1042562Z method(*args, **kwargs) 2025-12-04T10:49:11.1042712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1042753Z method(*args, **kwargs) 2025-12-04T10:49:11.1042902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1042940Z with policy(): 2025-12-04T10:49:11.1043090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1043130Z raise RuntimeError(msg) 2025-12-04T10:49:11.1043535Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1043539Z 2025-12-04T10:49:11.1043610Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1043894Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1043898Z 2025-12-04T10:49:11.1043981Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1044053Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1044107Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1044407Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1044478Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1044516Z graph_break [] 2025-12-04T10:49:11.1044586Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1044927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1044971Z if out == self.unknown_value: 2025-12-04T10:49:11.1045041Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1045094Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1045168Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1045439Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1045488Z graph_break [] 2025-12-04T10:49:11.1045539Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1045686Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1045750Z Traceback (most recent call last): 2025-12-04T10:49:11.1045903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1045943Z method(*args, **kwargs) 2025-12-04T10:49:11.1046095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1046134Z method(*args, **kwargs) 2025-12-04T10:49:11.1046282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1046319Z with policy(): 2025-12-04T10:49:11.1046470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1046510Z raise RuntimeError(msg) 2025-12-04T10:49:11.1046907Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1046911Z 2025-12-04T10:49:11.1046983Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1047270Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1047272Z 2025-12-04T10:49:11.1047356Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1047428Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1047481Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1047754Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1047824Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1047861Z graph_break [] 2025-12-04T10:49:11.1047974Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1048312Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1048359Z if out == self.unknown_value: 2025-12-04T10:49:11.1048430Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1048485Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1048555Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1048824Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1048859Z graph_break [] 2025-12-04T10:49:11.1048931Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1048983Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1049053Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1049327Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1049378Z graph_break [] 2025-12-04T10:49:11.1049618Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-082aaad3fa51c46e.xml - 2025-12-04T10:49:11.1049677Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1050305Z FAILED [0.5553s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1050309Z 2025-12-04T10:49:11.1050380Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1050666Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1050668Z 2025-12-04T10:49:11.1050751Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1050813Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1050878Z ================== 1 failed, 57 deselected, 2 rerun in 10.94s ================== 2025-12-04T10:49:11.1050914Z Got exit code 1 2025-12-04T10:49:11.1051148Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1051276Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1051470Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4e686d98d5ef2e7a.xml 2025-12-04T10:49:11.1051528Z ============================= test session starts ============================== 2025-12-04T10:49:11.1051637Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1051679Z cachedir: .pytest_cache 2025-12-04T10:49:11.1051898Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1051945Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1051984Z configfile: pytest.ini 2025-12-04T10:49:11.1052145Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1052221Z collecting ... collected 58 items / 31 deselected / 27 selected 2025-12-04T10:49:11.1052273Z stepcurrent: skipping 31 already run items. 2025-12-04T10:49:11.1052317Z Running 27 items in this shard 2025-12-04T10:49:11.1052319Z 2025-12-04T10:49:11.1052565Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7616s] [ 3%] 2025-12-04T10:49:11.1052810Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7031s] [ 3%] 2025-12-04T10:49:11.1053028Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.7114s] [ 3%] 2025-12-04T10:49:11.1053045Z 2025-12-04T10:49:11.1053097Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1053245Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1053306Z Traceback (most recent call last): 2025-12-04T10:49:11.1053461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1053500Z method(*args, **kwargs) 2025-12-04T10:49:11.1053652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1053691Z method(*args, **kwargs) 2025-12-04T10:49:11.1053840Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1053877Z with policy(): 2025-12-04T10:49:11.1054029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1054069Z raise RuntimeError(msg) 2025-12-04T10:49:11.1054474Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1054476Z 2025-12-04T10:49:11.1054546Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1054836Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1054838Z 2025-12-04T10:49:11.1054923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1054995Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1055049Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1055226Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1055297Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1055333Z graph_break [] 2025-12-04T10:49:11.1055504Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1055548Z Traceback (most recent call last): 
2025-12-04T10:49:11.1055701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1055740Z method(*args, **kwargs) 2025-12-04T10:49:11.1055890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1055928Z method(*args, **kwargs) 2025-12-04T10:49:11.1056080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1056115Z with policy(): 2025-12-04T10:49:11.1056266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1056306Z raise RuntimeError(msg) 2025-12-04T10:49:11.1056718Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1056730Z 2025-12-04T10:49:11.1056801Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1057089Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1057101Z 2025-12-04T10:49:11.1057186Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1057256Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1057311Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1057485Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1057557Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1057593Z graph_break [] 2025-12-04T10:49:11.1057664Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1057717Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1057787Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1057961Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1057999Z graph_break [] 2025-12-04T10:49:11.1058049Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1058199Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1058243Z Traceback (most recent call last): 2025-12-04T10:49:11.1059779Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1059822Z method(*args, **kwargs) 2025-12-04T10:49:11.1059976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1060017Z method(*args, **kwargs) 2025-12-04T10:49:11.1060168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1060204Z with policy(): 2025-12-04T10:49:11.1060357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1060399Z raise RuntimeError(msg) 2025-12-04T10:49:11.1060847Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1060850Z 2025-12-04T10:49:11.1060923Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1061209Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1061212Z 2025-12-04T10:49:11.1061297Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1061369Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1061425Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1061598Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1061670Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1061728Z graph_break [] 2025-12-04T10:49:11.1061799Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1061887Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1061978Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1062151Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1062188Z graph_break [] 2025-12-04T10:49:11.1062257Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1062312Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1062383Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1062557Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1062593Z graph_break [] 2025-12-04T10:49:11.1062835Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4e686d98d5ef2e7a.xml - 2025-12-04T10:49:11.1062895Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1063534Z FAILED [0.7114s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1063536Z 2025-12-04T10:49:11.1063674Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1063959Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1063963Z 2025-12-04T10:49:11.1064047Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1064108Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1064174Z ================== 1 failed, 31 deselected, 2 rerun in 4.32s =================== 2025-12-04T10:49:11.1064227Z Got exit code 1 2025-12-04T10:49:11.1064267Z Retrying single test... 
2025-12-04T10:49:11.1064465Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-153052ed0cf93f4e.xml 2025-12-04T10:49:11.1064522Z ============================= test session starts ============================== 2025-12-04T10:49:11.1064635Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1064675Z cachedir: .pytest_cache 2025-12-04T10:49:11.1064834Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1064879Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1064920Z configfile: pytest.ini 2025-12-04T10:49:11.1065083Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1065156Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1065439Z stepcurrent: skipping 31 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1065497Z Running 1 items in this shard 2025-12-04T10:49:11.1065499Z 2025-12-04T10:49:11.1065858Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:27:48.346999008 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1065875Z 2025-12-04T10:49:11.1066026Z [W1204 10:27:56.977714611 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1066029Z 2025-12-04T10:49:11.1066180Z [W1204 10:27:56.977869898 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066183Z 2025-12-04T10:49:11.1066331Z [W1204 10:27:56.981211646 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066335Z 2025-12-04T10:49:11.1066483Z [W1204 10:27:56.981492601 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066486Z 2025-12-04T10:49:11.1066633Z [W1204 10:27:56.981573320 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066636Z 2025-12-04T10:49:11.1066782Z [W1204 10:27:56.983971535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066784Z 2025-12-04T10:49:11.1066932Z [W1204 10:27:56.984240001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1066933Z 2025-12-04T10:49:11.1067079Z [W1204 10:27:56.984318339 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1067081Z 2025-12-04T10:49:11.1067150Z ('RERUN', {'yellow': True}) [10.4637s] [100%] 2025-12-04T10:49:11.1067504Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:27:57.255951422 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1067507Z 2025-12-04T10:49:11.1067654Z [W1204 10:27:57.256437574 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1067656Z 2025-12-04T10:49:11.1067813Z [W1204 10:27:57.256521812 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1067815Z 2025-12-04T10:49:11.1067962Z [W1204 10:27:57.257905347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1067965Z 2025-12-04T10:49:11.1068113Z [W1204 10:27:57.258232101 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1068115Z 2025-12-04T10:49:11.1068261Z [W1204 10:27:57.258313719 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1068264Z 2025-12-04T10:49:11.1068411Z [W1204 10:27:57.260573077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1068413Z 2025-12-04T10:49:11.1068564Z [W1204 10:27:57.260834023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1068566Z 2025-12-04T10:49:11.1068713Z [W1204 10:27:57.260910371 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1068715Z 2025-12-04T10:49:11.1068778Z ('RERUN', {'yellow': True}) [0.7721s] [100%] 2025-12-04T10:49:11.1069133Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:27:58.002617128 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069150Z 2025-12-04T10:49:11.1069298Z [W1204 10:27:58.003012060 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069300Z 2025-12-04T10:49:11.1069447Z [W1204 10:27:58.003102879 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069449Z 2025-12-04T10:49:11.1069595Z [W1204 10:27:58.004502063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069597Z 2025-12-04T10:49:11.1069746Z [W1204 10:27:58.004839227 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069747Z 2025-12-04T10:49:11.1069893Z [W1204 10:27:58.004917045 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1069896Z 2025-12-04T10:49:11.1070042Z [W1204 10:27:58.007135954 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1070044Z 2025-12-04T10:49:11.1070190Z [W1204 10:27:58.007402119 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1070193Z 2025-12-04T10:49:11.1070339Z [W1204 10:27:58.007477558 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1070341Z 2025-12-04T10:49:11.1070379Z FAILED [0.7392s] [100%] 2025-12-04T10:49:11.1070393Z 2025-12-04T10:49:11.1070446Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1070596Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1070642Z Traceback (most recent call last): 2025-12-04T10:49:11.1070799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1070839Z method(*args, **kwargs) 2025-12-04T10:49:11.1071001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1071040Z method(*args, **kwargs) 2025-12-04T10:49:11.1071191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1071227Z with policy(): 2025-12-04T10:49:11.1071380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1071420Z raise RuntimeError(msg) 2025-12-04T10:49:11.1071818Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1071822Z 2025-12-04T10:49:11.1071919Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1072207Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1072209Z 2025-12-04T10:49:11.1072297Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1072382Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1072438Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1072632Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1072703Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1072739Z graph_break [] 2025-12-04T10:49:11.1072811Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1073157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1073202Z if out == self.unknown_value: 2025-12-04T10:49:11.1073351Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1073396Z Traceback (most recent call last): 2025-12-04T10:49:11.1073555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1073598Z method(*args, **kwargs) 2025-12-04T10:49:11.1073748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1073786Z method(*args, **kwargs) 2025-12-04T10:49:11.1073936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1073973Z with policy(): 2025-12-04T10:49:11.1074124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1074180Z raise RuntimeError(msg) 2025-12-04T10:49:11.1074589Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1074593Z 2025-12-04T10:49:11.1074665Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1074970Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1074973Z 2025-12-04T10:49:11.1075058Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1075131Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1075188Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1075362Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1075435Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1075470Z graph_break [] 2025-12-04T10:49:11.1075541Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1075881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1075925Z if out == self.unknown_value: 2025-12-04T10:49:11.1075995Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1076064Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1076133Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1076307Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1076356Z graph_break [] 2025-12-04T10:49:11.1076409Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1076556Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1076602Z Traceback (most recent call last): 2025-12-04T10:49:11.1076754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1076793Z method(*args, **kwargs) 2025-12-04T10:49:11.1076944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1076984Z method(*args, **kwargs) 2025-12-04T10:49:11.1077132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1077170Z with policy(): 2025-12-04T10:49:11.1077323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1077363Z raise RuntimeError(msg) 2025-12-04T10:49:11.1077778Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1077781Z 2025-12-04T10:49:11.1077852Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1078150Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1078154Z 2025-12-04T10:49:11.1078238Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1078309Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1078364Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1078549Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1078619Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1078656Z graph_break [] 2025-12-04T10:49:11.1078726Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1079068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1079112Z if out == self.unknown_value: 2025-12-04T10:49:11.1079182Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1079236Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1079306Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1079480Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1079516Z graph_break [] 2025-12-04T10:49:11.1079586Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1079652Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1079723Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1079894Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1079947Z graph_break [] 2025-12-04T10:49:11.1080184Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-153052ed0cf93f4e.xml - 2025-12-04T10:49:11.1080245Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1080880Z FAILED [0.7392s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1080885Z 2025-12-04T10:49:11.1080955Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1081243Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1081245Z 2025-12-04T10:49:11.1081329Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1081390Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1081456Z ================== 1 failed, 57 deselected, 2 rerun in 12.13s ================== 2025-12-04T10:49:11.1081505Z Got exit code 1 2025-12-04T10:49:11.1081545Z Retrying single test... 2025-12-04T10:49:11.1081744Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-60dd41c207ab8e87.xml 2025-12-04T10:49:11.1081801Z ============================= test session starts ============================== 2025-12-04T10:49:11.1081950Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1081993Z cachedir: .pytest_cache 2025-12-04T10:49:11.1082149Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1082211Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1082251Z configfile: pytest.ini 2025-12-04T10:49:11.1082411Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1082484Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1082768Z stepcurrent: skipping 31 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1082812Z Running 1 items in this shard 2025-12-04T10:49:11.1082814Z 2025-12-04T10:49:11.1083175Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:28:07.122354272 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083178Z 2025-12-04T10:49:11.1083329Z [W1204 10:28:15.668429230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083333Z 2025-12-04T10:49:11.1083484Z [W1204 10:28:15.668586037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083498Z 2025-12-04T10:49:11.1083647Z [W1204 10:28:15.672049123 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083663Z 2025-12-04T10:49:11.1083811Z [W1204 10:28:15.672322778 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083813Z 2025-12-04T10:49:11.1083961Z [W1204 10:28:15.672399987 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1083964Z 2025-12-04T10:49:11.1084110Z [W1204 10:28:15.674734753 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1084112Z 2025-12-04T10:49:11.1084260Z [W1204 10:28:15.675007478 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1084262Z 2025-12-04T10:49:11.1084409Z [W1204 10:28:15.675086237 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1084411Z 2025-12-04T10:49:11.1084461Z ('RERUN', {'yellow': True}) [10.1662s] [100%] 2025-12-04T10:49:11.1084818Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:28:16.749204671 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1084821Z 2025-12-04T10:49:11.1084968Z [W1204 10:28:16.749573044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1084970Z 2025-12-04T10:49:11.1085135Z [W1204 10:28:16.749656793 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1085138Z 2025-12-04T10:49:11.1085287Z [W1204 10:28:16.751036467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1085289Z 2025-12-04T10:49:11.1085434Z [W1204 10:28:16.751360051 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1085436Z 2025-12-04T10:49:11.1085583Z [W1204 10:28:16.751436900 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1085585Z 2025-12-04T10:49:11.1085740Z [W1204 10:28:16.753621610 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1085742Z 2025-12-04T10:49:11.1085893Z [W1204 10:28:16.753878715 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1085896Z 2025-12-04T10:49:11.1086043Z [W1204 10:28:16.753953914 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1086047Z 2025-12-04T10:49:11.1086095Z ('RERUN', {'yellow': True}) [0.6011s] [100%] 2025-12-04T10:49:11.1086450Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:28:16.352921536 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1086452Z 2025-12-04T10:49:11.1086599Z [W1204 10:28:16.353304229 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1086601Z 2025-12-04T10:49:11.1086749Z [W1204 10:28:16.353396937 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1086764Z 2025-12-04T10:49:11.1086910Z [W1204 10:28:16.354768102 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1086925Z 2025-12-04T10:49:11.1087071Z [W1204 10:28:16.355108706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1087072Z 2025-12-04T10:49:11.1087218Z [W1204 10:28:16.355190304 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1087220Z 2025-12-04T10:49:11.1087367Z [W1204 10:28:16.357366204 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1087369Z 2025-12-04T10:49:11.1087517Z [W1204 10:28:16.357619610 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1087519Z 2025-12-04T10:49:11.1087665Z [W1204 10:28:16.357692138 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1087668Z 2025-12-04T10:49:11.1087706Z FAILED [0.5811s] [100%] 2025-12-04T10:49:11.1087708Z 2025-12-04T10:49:11.1087759Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1087907Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1087952Z Traceback (most recent call last): 2025-12-04T10:49:11.1088111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1088152Z method(*args, **kwargs) 2025-12-04T10:49:11.1088322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1088363Z method(*args, **kwargs) 2025-12-04T10:49:11.1088512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1088551Z with policy(): 2025-12-04T10:49:11.1088701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1088741Z raise RuntimeError(msg) 2025-12-04T10:49:11.1089150Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1089153Z 2025-12-04T10:49:11.1089226Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1089515Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1089520Z 2025-12-04T10:49:11.1089605Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1089677Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1089731Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1089906Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1089977Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1090014Z graph_break [] 2025-12-04T10:49:11.1090085Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1090432Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1090497Z if out == self.unknown_value: 2025-12-04T10:49:11.1090646Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1090689Z Traceback (most recent call last): 2025-12-04T10:49:11.1090842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1090881Z method(*args, **kwargs) 2025-12-04T10:49:11.1091032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1091071Z method(*args, **kwargs) 2025-12-04T10:49:11.1091220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1091257Z with policy(): 2025-12-04T10:49:11.1091408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1091450Z raise RuntimeError(msg) 2025-12-04T10:49:11.1091900Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1091903Z 2025-12-04T10:49:11.1091976Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1092279Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1092282Z 2025-12-04T10:49:11.1092368Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1092440Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1092496Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1092671Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1092743Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1092793Z graph_break [] 2025-12-04T10:49:11.1092864Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1093205Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1093249Z if out == self.unknown_value: 2025-12-04T10:49:11.1093320Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1093375Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1093445Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1093619Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1093656Z graph_break [] 2025-12-04T10:49:11.1093707Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1093856Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1093900Z Traceback (most recent call last): 2025-12-04T10:49:11.1094066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1094105Z method(*args, **kwargs) 2025-12-04T10:49:11.1094269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1094307Z method(*args, **kwargs) 2025-12-04T10:49:11.1094456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1094492Z with policy(): 2025-12-04T10:49:11.1094645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1094684Z raise RuntimeError(msg) 2025-12-04T10:49:11.1095096Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1095099Z 2025-12-04T10:49:11.1095172Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1095459Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1095461Z 2025-12-04T10:49:11.1095547Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1095618Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1095674Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1095857Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1095930Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1095965Z graph_break [] 2025-12-04T10:49:11.1096038Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1096377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1096420Z if out == self.unknown_value: 2025-12-04T10:49:11.1096502Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1096558Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1096629Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1096803Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1096840Z graph_break [] 2025-12-04T10:49:11.1096910Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1096964Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1097034Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1097206Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1097241Z graph_break [] 2025-12-04T10:49:11.1097484Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-60dd41c207ab8e87.xml - 2025-12-04T10:49:11.1097542Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1098186Z FAILED [0.5811s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1098200Z 2025-12-04T10:49:11.1098272Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1098559Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1098561Z 2025-12-04T10:49:11.1098646Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1098707Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1098773Z ================== 1 failed, 57 deselected, 2 rerun in 11.51s ================== 2025-12-04T10:49:11.1098810Z Got exit code 1 2025-12-04T10:49:11.1099049Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1099177Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1099373Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-636477257eff0c26.xml 2025-12-04T10:49:11.1099428Z ============================= test session starts ============================== 2025-12-04T10:49:11.1099553Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1099595Z cachedir: .pytest_cache 2025-12-04T10:49:11.1099753Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1099800Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1099840Z configfile: pytest.ini 2025-12-04T10:49:11.1100001Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1100073Z collecting ... collected 58 items / 32 deselected / 26 selected 2025-12-04T10:49:11.1100135Z stepcurrent: skipping 32 already run items. 2025-12-04T10:49:11.1100179Z Running 26 items in this shard 2025-12-04T10:49:11.1100181Z 2025-12-04T10:49:11.1100428Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.4738s] [ 3%] 2025-12-04T10:49:11.1100669Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4651s] [ 3%] 2025-12-04T10:49:11.1100890Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 FAILED [0.4679s] [ 3%] 2025-12-04T10:49:11.1100892Z 2025-12-04T10:49:11.1100942Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1101091Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1101136Z Traceback (most recent call last): 2025-12-04T10:49:11.1101294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1101334Z method(*args, **kwargs) 2025-12-04T10:49:11.1101496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1101536Z method(*args, **kwargs) 2025-12-04T10:49:11.1101706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1101743Z with policy(): 2025-12-04T10:49:11.1101937Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1101978Z raise RuntimeError(msg) 2025-12-04T10:49:11.1102372Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1102375Z 2025-12-04T10:49:11.1102449Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1102737Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1102741Z 2025-12-04T10:49:11.1102827Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1102897Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1102952Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1103129Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1103200Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1103236Z graph_break [] 2025-12-04T10:49:11.1103402Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1103447Z Traceback (most recent call last): 2025-12-04T10:49:11.1103600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1103640Z method(*args, **kwargs) 
2025-12-04T10:49:11.1103788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1103827Z method(*args, **kwargs) 2025-12-04T10:49:11.1103990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1104028Z with policy(): 2025-12-04T10:49:11.1104178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1104220Z raise RuntimeError(msg) 2025-12-04T10:49:11.1104619Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1104622Z 2025-12-04T10:49:11.1104694Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1104982Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1104986Z 2025-12-04T10:49:11.1105069Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1105140Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1105209Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1105383Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1105469Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.1105506Z graph_break [] 2025-12-04T10:49:11.1105576Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1105631Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1105700Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1105874Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1105909Z graph_break [] 2025-12-04T10:49:11.1105961Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1106109Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1106154Z Traceback (most recent call last): 2025-12-04T10:49:11.1106308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1106348Z method(*args, **kwargs) 2025-12-04T10:49:11.1106498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1106537Z method(*args, **kwargs) 2025-12-04T10:49:11.1106687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1106725Z with policy(): 2025-12-04T10:49:11.1106890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1106931Z raise RuntimeError(msg) 2025-12-04T10:49:11.1107329Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1107333Z 2025-12-04T10:49:11.1107403Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1107701Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1107704Z 2025-12-04T10:49:11.1107789Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1107864Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1107918Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1108092Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1108164Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1108200Z graph_break [] 2025-12-04T10:49:11.1108271Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1108324Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1108395Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1108568Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1108604Z graph_break [] 2025-12-04T10:49:11.1108687Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1108740Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1108809Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1108992Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1109028Z graph_break [] 2025-12-04T10:49:11.1109266Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-636477257eff0c26.xml - 2025-12-04T10:49:11.1109325Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1109956Z FAILED [0.4679s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1109960Z 2025-12-04T10:49:11.1110031Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1110316Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1110318Z 2025-12-04T10:49:11.1110402Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1110461Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1110538Z ================== 1 failed, 32 deselected, 2 rerun in 3.56s =================== 2025-12-04T10:49:11.1110576Z Got exit code 1 2025-12-04T10:49:11.1110616Z Retrying single test... 
2025-12-04T10:49:11.1110813Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7bc1841afb3bfe6d.xml 2025-12-04T10:49:11.1110871Z ============================= test session starts ============================== 2025-12-04T10:49:11.1110981Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1111023Z cachedir: .pytest_cache 2025-12-04T10:49:11.1111192Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1111239Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1111278Z configfile: pytest.ini 2025-12-04T10:49:11.1111440Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1111512Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1111792Z stepcurrent: skipping 32 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1111838Z Running 1 items in this shard 2025-12-04T10:49:11.1111840Z 2025-12-04T10:49:11.1112249Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:28:35.679373621 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1112251Z 2025-12-04T10:49:11.1112403Z [W1204 10:28:41.312769178 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1112422Z 2025-12-04T10:49:11.1112571Z [W1204 10:28:41.312949535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1112573Z 2025-12-04T10:49:11.1112735Z [W1204 10:28:41.316690906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1112737Z 2025-12-04T10:49:11.1112885Z [W1204 10:28:41.317014830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1112887Z 2025-12-04T10:49:11.1113035Z [W1204 10:28:41.317099588 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1113037Z 2025-12-04T10:49:11.1113184Z [W1204 10:28:41.319670491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1113186Z 2025-12-04T10:49:11.1113334Z [W1204 10:28:41.319943726 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1113335Z 2025-12-04T10:49:11.1113483Z [W1204 10:28:41.320028454 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1113486Z 2025-12-04T10:49:11.1113535Z ('RERUN', {'yellow': True}) [9.1522s] [100%] 2025-12-04T10:49:11.1113890Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:28:42.272454943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1113892Z 2025-12-04T10:49:11.1114040Z [W1204 10:28:42.272833876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114042Z 2025-12-04T10:49:11.1114205Z [W1204 10:28:42.272916315 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114207Z 2025-12-04T10:49:11.1114356Z [W1204 10:28:42.274302599 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114358Z 2025-12-04T10:49:11.1114506Z [W1204 10:28:42.274630523 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114510Z 2025-12-04T10:49:11.1114667Z [W1204 10:28:42.274711132 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114669Z 2025-12-04T10:49:11.1114817Z [W1204 10:28:42.276940211 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114819Z 2025-12-04T10:49:11.1114965Z [W1204 10:28:42.277207086 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1114968Z 2025-12-04T10:49:11.1115114Z [W1204 10:28:42.277285415 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1115117Z 2025-12-04T10:49:11.1115164Z ('RERUN', {'yellow': True}) [0.4558s] [100%] 2025-12-04T10:49:11.1115516Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:28:43.722585813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1115518Z 2025-12-04T10:49:11.1115668Z [W1204 10:28:43.722962476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1115669Z 2025-12-04T10:49:11.1115835Z [W1204 10:28:43.723051014 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1115837Z 2025-12-04T10:49:11.1115983Z [W1204 10:28:43.724444069 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1115995Z 2025-12-04T10:49:11.1116141Z [W1204 10:28:43.724768033 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1116143Z 2025-12-04T10:49:11.1116292Z [W1204 10:28:43.724846391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1116294Z 2025-12-04T10:49:11.1116441Z [W1204 10:28:43.727078740 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1116443Z 2025-12-04T10:49:11.1116589Z [W1204 10:28:43.727340766 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1116592Z 2025-12-04T10:49:11.1116739Z [W1204 10:28:43.727416804 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1116742Z 2025-12-04T10:49:11.1116779Z FAILED [0.4501s] [100%] 2025-12-04T10:49:11.1116781Z 2025-12-04T10:49:11.1116834Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1116983Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1117029Z Traceback (most recent call last): 2025-12-04T10:49:11.1117186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1117226Z method(*args, **kwargs) 2025-12-04T10:49:11.1117389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1117430Z method(*args, **kwargs) 2025-12-04T10:49:11.1117578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1117616Z with policy(): 2025-12-04T10:49:11.1117768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1117807Z raise RuntimeError(msg) 2025-12-04T10:49:11.1118213Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1118216Z 2025-12-04T10:49:11.1118289Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1118578Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1118582Z 2025-12-04T10:49:11.1118667Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1118739Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1118794Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1118970Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1119042Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1119077Z graph_break [] 2025-12-04T10:49:11.1119150Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1119503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1119560Z if out == self.unknown_value: 2025-12-04T10:49:11.1119707Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1119752Z Traceback (most recent call last): 2025-12-04T10:49:11.1119903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1119943Z method(*args, **kwargs) 2025-12-04T10:49:11.1120092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1120131Z method(*args, **kwargs) 2025-12-04T10:49:11.1120282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1120318Z with policy(): 2025-12-04T10:49:11.1120468Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1120510Z raise RuntimeError(msg) 2025-12-04T10:49:11.1120914Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1120917Z 2025-12-04T10:49:11.1120988Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1121288Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1121291Z 2025-12-04T10:49:11.1121376Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1121449Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1121503Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1121677Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1121757Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1121794Z graph_break [] 2025-12-04T10:49:11.1121901Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1122244Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1122288Z if out == self.unknown_value: 2025-12-04T10:49:11.1122360Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1122413Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1122484Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1122657Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1122694Z graph_break [] 2025-12-04T10:49:11.1122746Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1122896Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1122956Z Traceback (most recent call last): 2025-12-04T10:49:11.1123110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1123165Z method(*args, **kwargs) 2025-12-04T10:49:11.1123314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1123354Z method(*args, **kwargs) 2025-12-04T10:49:11.1123502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1123539Z with policy(): 2025-12-04T10:49:11.1123694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1123734Z raise RuntimeError(msg) 2025-12-04T10:49:11.1124138Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1124142Z 2025-12-04T10:49:11.1124214Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1124499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1124502Z 2025-12-04T10:49:11.1124588Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1124659Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1124713Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1124901Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1124973Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1125011Z graph_break [] 2025-12-04T10:49:11.1125081Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1125421Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1125476Z if out == self.unknown_value: 2025-12-04T10:49:11.1125547Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1125600Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1125670Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1125845Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1125881Z graph_break [] 2025-12-04T10:49:11.1125951Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1126004Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1126073Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1126246Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1126282Z graph_break [] 2025-12-04T10:49:11.1126523Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7bc1841afb3bfe6d.xml - 2025-12-04T10:49:11.1126583Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1127221Z FAILED [0.4501s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1127234Z 2025-12-04T10:49:11.1127307Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1127591Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1127594Z 2025-12-04T10:49:11.1127679Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1127740Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1127805Z ================== 1 failed, 57 deselected, 2 rerun in 10.21s ================== 2025-12-04T10:49:11.1127842Z Got exit code 1 2025-12-04T10:49:11.1127881Z Retrying single test... 2025-12-04T10:49:11.1128078Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0fe52e3b86d4b65e.xml 2025-12-04T10:49:11.1128133Z ============================= test session starts ============================== 2025-12-04T10:49:11.1128245Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1128285Z cachedir: .pytest_cache 2025-12-04T10:49:11.1128453Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1128500Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1128540Z configfile: pytest.ini 2025-12-04T10:49:11.1128701Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1128776Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1129057Z stepcurrent: skipping 32 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1129101Z Running 1 items in this shard 2025-12-04T10:49:11.1129120Z 2025-12-04T10:49:11.1129477Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:28:52.694976892 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1129481Z 2025-12-04T10:49:11.1129632Z [W1204 10:28:59.567614203 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1129635Z 2025-12-04T10:49:11.1129785Z [W1204 10:28:59.567790330 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1129787Z 2025-12-04T10:49:11.1129934Z [W1204 10:28:59.571838155 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1129936Z 2025-12-04T10:49:11.1130084Z [W1204 10:28:59.572145470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1130086Z 2025-12-04T10:49:11.1130232Z [W1204 10:28:59.572229718 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1130246Z 2025-12-04T10:49:11.1130393Z [W1204 10:28:59.574739652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1130404Z 2025-12-04T10:49:11.1130551Z [W1204 10:28:59.575007817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1130553Z 2025-12-04T10:49:11.1130699Z [W1204 10:28:59.575087175 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1130701Z 2025-12-04T10:49:11.1130750Z ('RERUN', {'yellow': True}) [9.4118s] [100%] 2025-12-04T10:49:11.1131102Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:29:00.568691146 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131105Z 2025-12-04T10:49:11.1131253Z [W1204 10:29:00.569130728 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131257Z 2025-12-04T10:49:11.1131406Z [W1204 10:29:00.569231317 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131408Z 2025-12-04T10:49:11.1131556Z [W1204 10:29:00.570690350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131559Z 2025-12-04T10:49:11.1131707Z [W1204 10:29:00.571054543 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131709Z 2025-12-04T10:49:11.1131896Z [W1204 10:29:00.571138661 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1131912Z 2025-12-04T10:49:11.1132059Z [W1204 10:29:00.573389600 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1132061Z 2025-12-04T10:49:11.1132210Z [W1204 10:29:00.573670885 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1132212Z 2025-12-04T10:49:11.1132357Z [W1204 10:29:00.573748233 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1132359Z 2025-12-04T10:49:11.1132407Z ('RERUN', {'yellow': True}) [0.4702s] [100%] 2025-12-04T10:49:11.1132773Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:29:00.019656215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1132777Z 2025-12-04T10:49:11.1132925Z [W1204 10:29:00.020033678 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1132927Z 2025-12-04T10:49:11.1133074Z [W1204 10:29:00.020122516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133076Z 2025-12-04T10:49:11.1133222Z [W1204 10:29:00.021488171 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133224Z 2025-12-04T10:49:11.1133371Z [W1204 10:29:00.021815465 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133373Z 2025-12-04T10:49:11.1133519Z [W1204 10:29:00.021893524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1133533Z 2025-12-04T10:49:11.1133686Z [W1204 10:29:00.024055344 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133687Z 2025-12-04T10:49:11.1133833Z [W1204 10:29:00.024311239 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133848Z 2025-12-04T10:49:11.1133994Z [W1204 10:29:00.024386388 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1133996Z 2025-12-04T10:49:11.1134034Z FAILED [0.4461s] [100%] 2025-12-04T10:49:11.1134036Z 2025-12-04T10:49:11.1134088Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1134237Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1134282Z Traceback (most recent call last): 2025-12-04T10:49:11.1134441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1134480Z method(*args, **kwargs) 2025-12-04T10:49:11.1134632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1134671Z method(*args, **kwargs) 2025-12-04T10:49:11.1134823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1134860Z with policy(): 2025-12-04T10:49:11.1135013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1135053Z raise RuntimeError(msg) 2025-12-04T10:49:11.1135461Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1135465Z 2025-12-04T10:49:11.1135538Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1135825Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1135827Z 2025-12-04T10:49:11.1135914Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1135996Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1136051Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1136228Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1136300Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1136336Z graph_break [] 2025-12-04T10:49:11.1136407Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1136750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1136794Z if out == self.unknown_value: 2025-12-04T10:49:11.1136942Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1136987Z Traceback (most recent call last): 2025-12-04T10:49:11.1137139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1137189Z method(*args, **kwargs) 2025-12-04T10:49:11.1137339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1137377Z method(*args, **kwargs) 2025-12-04T10:49:11.1137535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1137571Z with policy(): 2025-12-04T10:49:11.1137722Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1137761Z raise RuntimeError(msg) 2025-12-04T10:49:11.1138166Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1138170Z 2025-12-04T10:49:11.1138240Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1138524Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1138528Z 2025-12-04T10:49:11.1138612Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1138684Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1138740Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1138914Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1138986Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1139032Z graph_break [] 2025-12-04T10:49:11.1139104Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1139445Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1139489Z if out == self.unknown_value: 2025-12-04T10:49:11.1139558Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1139614Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1139695Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1139868Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1139904Z graph_break [] 2025-12-04T10:49:11.1139957Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1140105Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1140151Z Traceback (most recent call last): 2025-12-04T10:49:11.1140303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1140343Z method(*args, **kwargs) 2025-12-04T10:49:11.1140491Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1140531Z method(*args, **kwargs) 2025-12-04T10:49:11.1140679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1140716Z with policy(): 2025-12-04T10:49:11.1140868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1140919Z raise RuntimeError(msg) 2025-12-04T10:49:11.1141319Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1141332Z 2025-12-04T10:49:11.1141403Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1141689Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1141690Z 2025-12-04T10:49:11.1141775Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1141887Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1141942Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1142115Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1142187Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1142222Z graph_break [] 2025-12-04T10:49:11.1142294Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1142635Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1142678Z if out == self.unknown_value: 2025-12-04T10:49:11.1142770Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1142826Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1142896Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1143069Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1143106Z graph_break [] 2025-12-04T10:49:11.1143178Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1143231Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1143315Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1143487Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1143525Z graph_break [] 2025-12-04T10:49:11.1143767Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0fe52e3b86d4b65e.xml - 2025-12-04T10:49:11.1143828Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1144457Z FAILED [0.4461s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1144459Z 2025-12-04T10:49:11.1144530Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1144816Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1144844Z 2025-12-04T10:49:11.1144928Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1144990Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1145056Z ================== 1 failed, 57 deselected, 2 rerun in 10.47s ================== 2025-12-04T10:49:11.1145092Z Got exit code 1 2025-12-04T10:49:11.1145328Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1145457Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1145654Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-de02401c4172b159.xml 2025-12-04T10:49:11.1145711Z ============================= test session starts ============================== 2025-12-04T10:49:11.1145825Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1145865Z cachedir: .pytest_cache 2025-12-04T10:49:11.1146022Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1146067Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1146107Z configfile: pytest.ini 2025-12-04T10:49:11.1146267Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1146341Z collecting ... collected 58 items / 33 deselected / 25 selected 2025-12-04T10:49:11.1146392Z stepcurrent: skipping 33 already run items. 2025-12-04T10:49:11.1146448Z Running 25 items in this shard 2025-12-04T10:49:11.1146450Z 2025-12-04T10:49:11.1146697Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8011s] [ 4%] 2025-12-04T10:49:11.1146942Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4292s] [ 4%] 2025-12-04T10:49:11.1147170Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.4394s] [ 4%] 2025-12-04T10:49:11.1147173Z 2025-12-04T10:49:11.1147223Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1147371Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1147416Z Traceback (most recent call last): 2025-12-04T10:49:11.1147575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1147615Z method(*args, **kwargs) 2025-12-04T10:49:11.1147766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1147805Z method(*args, **kwargs) 2025-12-04T10:49:11.1147955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1147992Z with policy(): 2025-12-04T10:49:11.1148143Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1148182Z raise RuntimeError(msg) 2025-12-04T10:49:11.1148574Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1148595Z 2025-12-04T10:49:11.1148666Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1148953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1148955Z 2025-12-04T10:49:11.1149042Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1149112Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1149167Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1149439Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1149512Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1149547Z graph_break [] 2025-12-04T10:49:11.1149694Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1149739Z Traceback (most recent call last): 2025-12-04T10:49:11.1149892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1149931Z method(*args, **kwargs) 2025-12-04T10:49:11.1150081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1150119Z method(*args, **kwargs) 2025-12-04T10:49:11.1150279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1150316Z with policy(): 2025-12-04T10:49:11.1150467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1150507Z raise RuntimeError(msg) 2025-12-04T10:49:11.1150919Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1150921Z 2025-12-04T10:49:11.1150994Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1151282Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1151285Z 2025-12-04T10:49:11.1151369Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1151441Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1151496Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1151767Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1151838Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1151922Z graph_break [] 2025-12-04T10:49:11.1151994Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1152065Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1152136Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1152406Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1152455Z graph_break [] 2025-12-04T10:49:11.1152507Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1152656Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1152702Z Traceback (most recent call last): 2025-12-04T10:49:11.1152853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1152893Z method(*args, **kwargs) 2025-12-04T10:49:11.1153044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1153084Z method(*args, **kwargs) 2025-12-04T10:49:11.1153240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1153277Z with policy(): 2025-12-04T10:49:11.1153429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1153469Z raise RuntimeError(msg) 2025-12-04T10:49:11.1153872Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1153875Z 2025-12-04T10:49:11.1153960Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1154245Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1154248Z 2025-12-04T10:49:11.1154333Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1154404Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1154457Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1154740Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1154812Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1154849Z graph_break [] 2025-12-04T10:49:11.1154918Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1154971Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1155041Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1155310Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1155345Z graph_break [] 2025-12-04T10:49:11.1155417Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1155470Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1155542Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1155810Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1155858Z graph_break [] 2025-12-04T10:49:11.1156115Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-de02401c4172b159.xml - 2025-12-04T10:49:11.1156173Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1156801Z FAILED [0.4394s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1156804Z 2025-12-04T10:49:11.1156875Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1157161Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1157164Z 2025-12-04T10:49:11.1157248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1157308Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1157374Z ================== 1 failed, 33 deselected, 2 rerun in 3.81s =================== 2025-12-04T10:49:11.1157410Z Got exit code 1 2025-12-04T10:49:11.1157451Z Retrying single test... 2025-12-04T10:49:11.1157657Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-05c9a9b596f0519d.xml 2025-12-04T10:49:11.1157716Z ============================= test session starts ============================== 2025-12-04T10:49:11.1157826Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1157868Z cachedir: .pytest_cache 2025-12-04T10:49:11.1158024Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1158070Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1158110Z configfile: pytest.ini 2025-12-04T10:49:11.1158281Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1158353Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1158636Z stepcurrent: skipping 33 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1158681Z Running 1 items in this shard 2025-12-04T10:49:11.1158684Z 2025-12-04T10:49:11.1159041Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:20.015190370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1159044Z 2025-12-04T10:49:11.1159196Z [W1204 10:29:27.354963091 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159197Z 2025-12-04T10:49:11.1159346Z [W1204 10:29:27.355122478 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159348Z 2025-12-04T10:49:11.1159498Z [W1204 10:29:27.358527486 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159510Z 2025-12-04T10:49:11.1159656Z [W1204 10:29:27.358822980 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159672Z 2025-12-04T10:49:11.1159823Z [W1204 10:29:27.358900259 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159825Z 2025-12-04T10:49:11.1159973Z [W1204 10:29:27.361376634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1159974Z 2025-12-04T10:49:11.1160120Z [W1204 10:29:27.361640549 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1160122Z 2025-12-04T10:49:11.1160270Z [W1204 10:29:27.361714727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1160273Z 2025-12-04T10:49:11.1160322Z ('RERUN', {'yellow': True}) [10.4238s] [100%] 2025-12-04T10:49:11.1160676Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:28.165805745 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1160679Z 2025-12-04T10:49:11.1160829Z [W1204 10:29:28.166207028 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1160831Z 2025-12-04T10:49:11.1160978Z [W1204 10:29:28.166294526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1160980Z 2025-12-04T10:49:11.1161141Z [W1204 10:29:28.167703260 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1161143Z 2025-12-04T10:49:11.1161292Z [W1204 10:29:28.167953215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1161294Z 2025-12-04T10:49:11.1161444Z [W1204 10:29:28.168031004 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1161446Z 2025-12-04T10:49:11.1161595Z [W1204 10:29:28.170233784 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1161610Z 2025-12-04T10:49:11.1161756Z [W1204 10:29:28.170490139 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1161758Z 2025-12-04T10:49:11.1161953Z [W1204 10:29:28.170562727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1161956Z 2025-12-04T10:49:11.1162004Z ('RERUN', {'yellow': True}) [0.6537s] [100%] 2025-12-04T10:49:11.1162356Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:29.804321405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1162359Z 2025-12-04T10:49:11.1162505Z [W1204 10:29:29.804744927 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1162509Z 2025-12-04T10:49:11.1162657Z [W1204 10:29:29.804829425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1162658Z 2025-12-04T10:49:11.1162807Z [W1204 10:29:29.806205630 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1162831Z 2025-12-04T10:49:11.1162977Z [W1204 10:29:29.806457345 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1162992Z 2025-12-04T10:49:11.1163138Z [W1204 10:29:29.806529014 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1163140Z 2025-12-04T10:49:11.1163285Z [W1204 10:29:29.808740453 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1163287Z 2025-12-04T10:49:11.1163434Z [W1204 10:29:29.808992569 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1163436Z 2025-12-04T10:49:11.1163585Z [W1204 10:29:29.809069197 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1163587Z 2025-12-04T10:49:11.1163625Z FAILED [0.6251s] [100%] 2025-12-04T10:49:11.1163627Z 2025-12-04T10:49:11.1163678Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1163827Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1163872Z Traceback (most recent call last): 2025-12-04T10:49:11.1164033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1164073Z method(*args, **kwargs) 2025-12-04T10:49:11.1164225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1164265Z method(*args, **kwargs) 2025-12-04T10:49:11.1164435Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1166358Z with policy(): 2025-12-04T10:49:11.1166519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1166563Z raise RuntimeError(msg) 2025-12-04T10:49:11.1166962Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1166964Z 2025-12-04T10:49:11.1167073Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1167361Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1167364Z 2025-12-04T10:49:11.1167452Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1167526Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1167586Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1167859Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1167932Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1167969Z graph_break [] 2025-12-04T10:49:11.1168042Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1168391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1168445Z if out == self.unknown_value: 2025-12-04T10:49:11.1168594Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1168652Z Traceback (most recent call last): 2025-12-04T10:49:11.1168806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1168845Z method(*args, **kwargs) 2025-12-04T10:49:11.1168996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1169035Z method(*args, **kwargs) 2025-12-04T10:49:11.1169185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1169222Z with policy(): 2025-12-04T10:49:11.1169375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1169415Z raise RuntimeError(msg) 2025-12-04T10:49:11.1169819Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1169821Z 2025-12-04T10:49:11.1169895Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1170182Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1170184Z 2025-12-04T10:49:11.1170284Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1170356Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1170412Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1170683Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1170756Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1170792Z graph_break [] 2025-12-04T10:49:11.1170876Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1171221Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1171264Z if out == self.unknown_value: 2025-12-04T10:49:11.1171334Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1171391Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1171462Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1171733Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1171770Z graph_break [] 2025-12-04T10:49:11.1171821Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1172008Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1172081Z Traceback (most recent call last): 2025-12-04T10:49:11.1172236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1172274Z method(*args, **kwargs) 2025-12-04T10:49:11.1172439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1172477Z method(*args, **kwargs) 2025-12-04T10:49:11.1172629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1172664Z with policy(): 2025-12-04T10:49:11.1172818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1172857Z raise RuntimeError(msg) 2025-12-04T10:49:11.1173260Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1173264Z 2025-12-04T10:49:11.1173337Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1173620Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1173623Z 2025-12-04T10:49:11.1173709Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1173780Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1173834Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1174119Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1174192Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1174229Z graph_break [] 2025-12-04T10:49:11.1174300Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1174637Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1174695Z if out == self.unknown_value: 2025-12-04T10:49:11.1174766Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1174820Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1174891Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1175161Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1175198Z graph_break [] 2025-12-04T10:49:11.1175267Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1175320Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1175390Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1175657Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1175692Z graph_break [] 2025-12-04T10:49:11.1175936Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-05c9a9b596f0519d.xml - 2025-12-04T10:49:11.1176005Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1176646Z FAILED [0.6251s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1176649Z 2025-12-04T10:49:11.1176721Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1177008Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1177010Z 2025-12-04T10:49:11.1177095Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1177157Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1177223Z ================== 1 failed, 57 deselected, 2 rerun in 11.84s ================== 2025-12-04T10:49:11.1177259Z Got exit code 1 2025-12-04T10:49:11.1177300Z Retrying single test... 2025-12-04T10:49:11.1177499Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cc4ab597b069c53d.xml 2025-12-04T10:49:11.1177557Z ============================= test session starts ============================== 2025-12-04T10:49:11.1177680Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1177722Z cachedir: .pytest_cache 2025-12-04T10:49:11.1177880Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1177927Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1177967Z configfile: pytest.ini 2025-12-04T10:49:11.1178129Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1178202Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1178496Z stepcurrent: skipping 33 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1178541Z Running 1 items in this shard 2025-12-04T10:49:11.1178543Z 2025-12-04T10:49:11.1178899Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:39.600496822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1178903Z 2025-12-04T10:49:11.1179055Z [W1204 10:29:45.241466224 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179057Z 2025-12-04T10:49:11.1179206Z [W1204 10:29:45.241625311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179209Z 2025-12-04T10:49:11.1179356Z [W1204 10:29:45.245255665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179358Z 2025-12-04T10:49:11.1179507Z [W1204 10:29:45.245557789 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179524Z 2025-12-04T10:49:11.1179669Z [W1204 10:29:45.245633518 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179682Z 2025-12-04T10:49:11.1179829Z [W1204 10:29:45.248066943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1179831Z 2025-12-04T10:49:11.1179976Z [W1204 10:29:45.248335218 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1179979Z 2025-12-04T10:49:11.1180125Z [W1204 10:29:45.248412017 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1180127Z 2025-12-04T10:49:11.1180176Z ('RERUN', {'yellow': True}) [9.5081s] [100%] 2025-12-04T10:49:11.1180529Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:46.885019267 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1180532Z 2025-12-04T10:49:11.1180681Z [W1204 10:29:46.885381650 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1180682Z 2025-12-04T10:49:11.1180829Z [W1204 10:29:46.885466959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1180831Z 2025-12-04T10:49:11.1180980Z [W1204 10:29:46.886845743 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1180982Z 2025-12-04T10:49:11.1181130Z [W1204 10:29:46.887107338 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1181149Z 2025-12-04T10:49:11.1181296Z [W1204 10:29:46.887183227 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1181298Z 2025-12-04T10:49:11.1181447Z [W1204 10:29:46.889366217 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1181448Z 2025-12-04T10:49:11.1181595Z [W1204 10:29:46.889622302 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1181597Z 2025-12-04T10:49:11.1181755Z [W1204 10:29:46.889696441 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1181757Z 2025-12-04T10:49:11.1181806Z ('RERUN', {'yellow': True}) [0.5081s] [100%] 2025-12-04T10:49:11.1182279Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:29:46.377259678 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1182282Z 2025-12-04T10:49:11.1182430Z [W1204 10:29:46.377663901 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1182431Z 2025-12-04T10:49:11.1182577Z [W1204 10:29:46.377757709 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1182579Z 2025-12-04T10:49:11.1182726Z [W1204 10:29:46.379155973 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1182728Z 2025-12-04T10:49:11.1182873Z [W1204 10:29:46.379414009 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1182891Z 2025-12-04T10:49:11.1183038Z [W1204 10:29:46.379490217 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1183039Z 2025-12-04T10:49:11.1183186Z [W1204 10:29:46.381675977 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1183204Z 2025-12-04T10:49:11.1183352Z [W1204 10:29:46.381933292 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1183354Z 2025-12-04T10:49:11.1183502Z [W1204 10:29:46.382012511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1183504Z 2025-12-04T10:49:11.1183541Z FAILED [0.4866s] [100%] 2025-12-04T10:49:11.1183543Z 2025-12-04T10:49:11.1183596Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1183744Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1183791Z Traceback (most recent call last): 2025-12-04T10:49:11.1183948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1183988Z method(*args, **kwargs) 2025-12-04T10:49:11.1184140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1184179Z method(*args, **kwargs) 2025-12-04T10:49:11.1184331Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1184367Z with policy(): 2025-12-04T10:49:11.1184520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1184559Z raise RuntimeError(msg) 2025-12-04T10:49:11.1184968Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1184972Z 2025-12-04T10:49:11.1185044Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1185351Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1185354Z 2025-12-04T10:49:11.1185440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1185511Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1185568Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1185838Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1185910Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1185946Z graph_break [] 2025-12-04T10:49:11.1186017Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1186358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1186402Z if out == self.unknown_value: 2025-12-04T10:49:11.1186551Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1186607Z Traceback (most recent call last): 2025-12-04T10:49:11.1186759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1186809Z method(*args, **kwargs) 2025-12-04T10:49:11.1186958Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1186997Z method(*args, **kwargs) 2025-12-04T10:49:11.1187145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1187183Z with policy(): 2025-12-04T10:49:11.1187333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1187373Z raise RuntimeError(msg) 2025-12-04T10:49:11.1187775Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1187778Z 2025-12-04T10:49:11.1187850Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1188135Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1188137Z 2025-12-04T10:49:11.1188222Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1188293Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1188348Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1188628Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1188700Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1188737Z graph_break [] 2025-12-04T10:49:11.1188806Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1189157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1189200Z if out == self.unknown_value: 2025-12-04T10:49:11.1189271Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1189326Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1189397Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1189663Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1189700Z graph_break [] 2025-12-04T10:49:11.1189751Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1189899Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1189945Z Traceback (most recent call last): 2025-12-04T10:49:11.1190098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1190137Z method(*args, **kwargs) 2025-12-04T10:49:11.1190287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1190336Z method(*args, **kwargs) 2025-12-04T10:49:11.1190484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1190533Z with policy(): 2025-12-04T10:49:11.1190684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1190724Z raise RuntimeError(msg) 2025-12-04T10:49:11.1191129Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1191131Z 2025-12-04T10:49:11.1191204Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1191489Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1191495Z 2025-12-04T10:49:11.1191580Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1191651Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1191705Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1191995Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1192065Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1192119Z graph_break [] 2025-12-04T10:49:11.1192191Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1192531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1192574Z if out == self.unknown_value: 2025-12-04T10:49:11.1192644Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1192698Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1192781Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1193048Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1193086Z graph_break [] 2025-12-04T10:49:11.1193156Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1193209Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1193279Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1193543Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1193580Z graph_break [] 2025-12-04T10:49:11.1193822Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cc4ab597b069c53d.xml - 2025-12-04T10:49:11.1193882Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1194514Z FAILED [0.4866s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1194543Z 2025-12-04T10:49:11.1194614Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1194898Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1194901Z 2025-12-04T10:49:11.1194983Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1195046Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1195111Z ================== 1 failed, 57 deselected, 2 rerun in 10.64s ================== 2025-12-04T10:49:11.1195150Z Got exit code 1 2025-12-04T10:49:11.1195385Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1195513Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1195710Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-97a5c7bb711cad17.xml 2025-12-04T10:49:11.1195766Z ============================= test session starts ============================== 2025-12-04T10:49:11.1195878Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1195931Z cachedir: .pytest_cache 2025-12-04T10:49:11.1196090Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1196136Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1196176Z configfile: pytest.ini 2025-12-04T10:49:11.1196336Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1196409Z collecting ... collected 58 items / 34 deselected / 24 selected 2025-12-04T10:49:11.1196461Z stepcurrent: skipping 34 already run items. 2025-12-04T10:49:11.1196516Z Running 24 items in this shard 2025-12-04T10:49:11.1196518Z 2025-12-04T10:49:11.1196765Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6063s] [ 4%] 2025-12-04T10:49:11.1197010Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6845s] [ 4%] 2025-12-04T10:49:11.1197230Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.6605s] [ 4%] 2025-12-04T10:49:11.1197235Z 2025-12-04T10:49:11.1197285Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1197434Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1197479Z Traceback (most recent call last): 2025-12-04T10:49:11.1197636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1197674Z method(*args, **kwargs) 2025-12-04T10:49:11.1197827Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1197878Z method(*args, **kwargs) 2025-12-04T10:49:11.1198028Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1198076Z with policy(): 2025-12-04T10:49:11.1198228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1198268Z raise RuntimeError(msg) 2025-12-04T10:49:11.1198668Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1198671Z 2025-12-04T10:49:11.1198743Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1199031Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1199034Z 2025-12-04T10:49:11.1199118Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1199189Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1199243Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1199420Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1199492Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1199528Z graph_break [] 2025-12-04T10:49:11.1199696Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1199742Z Traceback (most recent call last): 
2025-12-04T10:49:11.1199894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1199933Z method(*args, **kwargs) 2025-12-04T10:49:11.1200083Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1200120Z method(*args, **kwargs) 2025-12-04T10:49:11.1200279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1200315Z with policy(): 2025-12-04T10:49:11.1200465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1200505Z raise RuntimeError(msg) 2025-12-04T10:49:11.1200913Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1200916Z 2025-12-04T10:49:11.1200988Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1201272Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1201275Z 2025-12-04T10:49:11.1201359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1201429Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1201484Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1201667Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1201749Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1201785Z graph_break [] 2025-12-04T10:49:11.1201897Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1201950Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1202020Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1202193Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1202229Z graph_break [] 2025-12-04T10:49:11.1202280Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1202428Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1202473Z Traceback (most recent call last): 2025-12-04T10:49:11.1202627Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1202667Z method(*args, **kwargs) 2025-12-04T10:49:11.1202815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1202854Z method(*args, **kwargs) 2025-12-04T10:49:11.1203003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1203040Z with policy(): 2025-12-04T10:49:11.1203191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1203248Z raise RuntimeError(msg) 2025-12-04T10:49:11.1203650Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1203653Z 2025-12-04T10:49:11.1203725Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1204021Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1204025Z 2025-12-04T10:49:11.1204109Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1204180Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1204235Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1204408Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1204480Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1204516Z graph_break [] 2025-12-04T10:49:11.1204586Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1204640Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1204708Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1204884Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1204920Z graph_break [] 2025-12-04T10:49:11.1204991Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1205057Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1205127Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1205311Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1205347Z graph_break [] 2025-12-04T10:49:11.1205589Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-97a5c7bb711cad17.xml - 2025-12-04T10:49:11.1205648Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1206277Z FAILED [0.6605s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1206281Z 2025-12-04T10:49:11.1206351Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1206636Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1206638Z 2025-12-04T10:49:11.1206723Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1206783Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1206858Z ================== 1 failed, 34 deselected, 2 rerun in 4.10s =================== 2025-12-04T10:49:11.1206897Z Got exit code 1 2025-12-04T10:49:11.1206937Z Retrying single test... 
2025-12-04T10:49:11.1207133Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-680946d3fdcd572a.xml 2025-12-04T10:49:11.1207190Z ============================= test session starts ============================== 2025-12-04T10:49:11.1207301Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1207341Z cachedir: .pytest_cache 2025-12-04T10:49:11.1207510Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1207557Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1207597Z configfile: pytest.ini 2025-12-04T10:49:11.1207759Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1207833Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1208116Z stepcurrent: skipping 34 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1208160Z Running 1 items in this shard 2025-12-04T10:49:11.1208163Z 2025-12-04T10:49:11.1208524Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:06.083910522 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1208527Z 2025-12-04T10:49:11.1208679Z [W1204 10:30:13.390776863 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1208681Z 2025-12-04T10:49:11.1208842Z [W1204 10:30:13.390919061 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1208845Z 2025-12-04T10:49:11.1208992Z [W1204 10:30:13.394711521 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209004Z 2025-12-04T10:49:11.1209152Z [W1204 10:30:13.395005805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209154Z 2025-12-04T10:49:11.1209301Z [W1204 10:30:13.395085364 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209303Z 2025-12-04T10:49:11.1209450Z [W1204 10:30:13.397543089 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209452Z 2025-12-04T10:49:11.1209599Z [W1204 10:30:13.397804374 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209603Z 2025-12-04T10:49:11.1209748Z [W1204 10:30:13.397880643 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1209751Z 2025-12-04T10:49:11.1209801Z ('RERUN', {'yellow': True}) [9.9512s] [100%] 2025-12-04T10:49:11.1210156Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:14.384642176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1210158Z 2025-12-04T10:49:11.1210306Z [W1204 10:30:14.385108688 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1210308Z 2025-12-04T10:49:11.1210464Z [W1204 10:30:14.385213326 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1210468Z 2025-12-04T10:49:11.1210613Z [W1204 10:30:14.386624430 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1210616Z 2025-12-04T10:49:11.1210762Z [W1204 10:30:14.386978814 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1210764Z 2025-12-04T10:49:11.1210920Z [W1204 10:30:14.387066212 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1210922Z 2025-12-04T10:49:11.1211068Z [W1204 10:30:14.389363630 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1211070Z 2025-12-04T10:49:11.1211216Z [W1204 10:30:14.389635025 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1211218Z 2025-12-04T10:49:11.1211365Z [W1204 10:30:14.389711403 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1211368Z 2025-12-04T10:49:11.1211524Z ('RERUN', {'yellow': True}) [0.5000s] [100%] 2025-12-04T10:49:11.1211911Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:15.863155876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1211913Z 2025-12-04T10:49:11.1212061Z [W1204 10:30:15.863526380 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212063Z 2025-12-04T10:49:11.1212212Z [W1204 10:30:15.863612568 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212231Z 2025-12-04T10:49:11.1212379Z [W1204 10:30:15.864982473 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212403Z 2025-12-04T10:49:11.1212549Z [W1204 10:30:15.865311877 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212551Z 2025-12-04T10:49:11.1212697Z [W1204 10:30:15.865388955 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212700Z 2025-12-04T10:49:11.1212847Z [W1204 10:30:15.867631294 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1212849Z 2025-12-04T10:49:11.1212996Z [W1204 10:30:15.867886090 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1212998Z 2025-12-04T10:49:11.1213144Z [W1204 10:30:15.867960638 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1213149Z 2025-12-04T10:49:11.1213187Z FAILED [0.4693s] [100%] 2025-12-04T10:49:11.1213190Z 2025-12-04T10:49:11.1213242Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1213391Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1213436Z Traceback (most recent call last): 2025-12-04T10:49:11.1213594Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1213634Z method(*args, **kwargs) 2025-12-04T10:49:11.1213802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1213842Z method(*args, **kwargs) 2025-12-04T10:49:11.1213993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1214030Z with policy(): 2025-12-04T10:49:11.1214183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1214222Z raise RuntimeError(msg) 2025-12-04T10:49:11.1214633Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1214635Z 2025-12-04T10:49:11.1214708Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1214998Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1215001Z 2025-12-04T10:49:11.1215086Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1215157Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1215212Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1215387Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1215458Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1215493Z graph_break [] 2025-12-04T10:49:11.1215565Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1215921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1215979Z if out == self.unknown_value: 2025-12-04T10:49:11.1216127Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1216172Z Traceback (most recent call last): 2025-12-04T10:49:11.1216324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1216364Z method(*args, **kwargs) 2025-12-04T10:49:11.1216515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1216554Z method(*args, **kwargs) 2025-12-04T10:49:11.1216702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1216740Z with policy(): 2025-12-04T10:49:11.1216890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1216931Z raise RuntimeError(msg) 2025-12-04T10:49:11.1217336Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1217339Z 2025-12-04T10:49:11.1217410Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1217707Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1217710Z 2025-12-04T10:49:11.1217795Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1217869Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1217923Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1218097Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1218178Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1218218Z graph_break [] 2025-12-04T10:49:11.1218288Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1218629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1218674Z if out == self.unknown_value: 2025-12-04T10:49:11.1218743Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1218798Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1218868Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1219041Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1219077Z graph_break [] 2025-12-04T10:49:11.1219129Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1219277Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1219334Z Traceback (most recent call last): 2025-12-04T10:49:11.1219486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1219525Z method(*args, **kwargs) 2025-12-04T10:49:11.1219687Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1219726Z method(*args, **kwargs) 2025-12-04T10:49:11.1219876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1219912Z with policy(): 2025-12-04T10:49:11.1220063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1220103Z raise RuntimeError(msg) 2025-12-04T10:49:11.1220510Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1220515Z 2025-12-04T10:49:11.1220586Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1220871Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1220873Z 2025-12-04T10:49:11.1220957Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1221028Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1221082Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1221266Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1221338Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1221374Z graph_break [] 2025-12-04T10:49:11.1221445Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1221787Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1221829Z if out == self.unknown_value: 2025-12-04T10:49:11.1221940Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1221994Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1222065Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1222240Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1222276Z graph_break [] 2025-12-04T10:49:11.1222347Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1222401Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1222471Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1222643Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1222680Z graph_break [] 2025-12-04T10:49:11.1222920Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-680946d3fdcd572a.xml - 2025-12-04T10:49:11.1222980Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1223622Z FAILED [0.4693s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1223641Z 2025-12-04T10:49:11.1223712Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1223999Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1224001Z 2025-12-04T10:49:11.1224085Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1224147Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1224212Z ================== 1 failed, 57 deselected, 2 rerun in 11.06s ================== 2025-12-04T10:49:11.1224250Z Got exit code 1 2025-12-04T10:49:11.1224289Z Retrying single test... 2025-12-04T10:49:11.1224487Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dac08232149f1240.xml 2025-12-04T10:49:11.1224543Z ============================= test session starts ============================== 2025-12-04T10:49:11.1224656Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1224696Z cachedir: .pytest_cache 2025-12-04T10:49:11.1224854Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1224914Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1224955Z configfile: pytest.ini 2025-12-04T10:49:11.1225114Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1225187Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1225471Z stepcurrent: skipping 34 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1225516Z Running 1 items in this shard 2025-12-04T10:49:11.1225518Z 2025-12-04T10:49:11.1225888Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:23.470409222 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1225891Z 2025-12-04T10:49:11.1226041Z [W1204 10:30:31.755077668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226045Z 2025-12-04T10:49:11.1226195Z [W1204 10:30:31.755235366 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226197Z 2025-12-04T10:49:11.1226343Z [W1204 10:30:31.758866429 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226346Z 2025-12-04T10:49:11.1226493Z [W1204 10:30:31.759173223 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226495Z 2025-12-04T10:49:11.1226641Z [W1204 10:30:31.759255962 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226663Z 2025-12-04T10:49:11.1226812Z [W1204 10:30:31.761613699 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1226813Z 2025-12-04T10:49:11.1226978Z [W1204 10:30:31.761875364 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1226980Z 2025-12-04T10:49:11.1227125Z [W1204 10:30:31.761951122 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1227127Z 2025-12-04T10:49:11.1227176Z ('RERUN', {'yellow': True}) [9.9372s] [100%] 2025-12-04T10:49:11.1227530Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:32.920501213 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1227534Z 2025-12-04T10:49:11.1227680Z [W1204 10:30:32.920946665 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1227682Z 2025-12-04T10:49:11.1227831Z [W1204 10:30:32.921041483 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1227833Z 2025-12-04T10:49:11.1227978Z [W1204 10:30:32.922446698 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1227980Z 2025-12-04T10:49:11.1228127Z [W1204 10:30:32.922778811 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1228129Z 2025-12-04T10:49:11.1228275Z [W1204 10:30:32.922858160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1228278Z 2025-12-04T10:49:11.1228434Z [W1204 10:30:32.925153628 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1228436Z 2025-12-04T10:49:11.1228583Z [W1204 10:30:32.925413193 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1228586Z 2025-12-04T10:49:11.1228731Z [W1204 10:30:32.925489042 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1228734Z 2025-12-04T10:49:11.1228782Z ('RERUN', {'yellow': True}) [0.6617s] [100%] 2025-12-04T10:49:11.1229147Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:30:33.579670844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1229152Z 2025-12-04T10:49:11.1229299Z [W1204 10:30:33.580085986 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1229301Z 2025-12-04T10:49:11.1229449Z [W1204 10:30:33.580182294 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1229451Z 2025-12-04T10:49:11.1229596Z [W1204 10:30:33.581584788 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1229598Z 2025-12-04T10:49:11.1229745Z [W1204 10:30:33.581917342 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1229747Z 2025-12-04T10:49:11.1229893Z [W1204 10:30:33.581996711 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1229896Z 2025-12-04T10:49:11.1230053Z [W1204 10:30:33.584294599 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1230055Z 2025-12-04T10:49:11.1230201Z [W1204 10:30:33.584555834 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1230215Z 2025-12-04T10:49:11.1230361Z [W1204 10:30:33.584636942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1230363Z 2025-12-04T10:49:11.1230402Z FAILED [0.6531s] [100%] 2025-12-04T10:49:11.1230404Z 2025-12-04T10:49:11.1230456Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1230604Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1230650Z Traceback (most recent call last): 2025-12-04T10:49:11.1230807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1230848Z method(*args, **kwargs) 2025-12-04T10:49:11.1231001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1231043Z method(*args, **kwargs) 2025-12-04T10:49:11.1231194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1231231Z with policy(): 2025-12-04T10:49:11.1231383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1231424Z raise RuntimeError(msg) 2025-12-04T10:49:11.1231831Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1231835Z 2025-12-04T10:49:11.1231948Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1232234Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1232236Z 2025-12-04T10:49:11.1232322Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1232408Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1232465Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1232640Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1232712Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1232749Z graph_break [] 2025-12-04T10:49:11.1232819Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1233164Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1233207Z if out == self.unknown_value: 2025-12-04T10:49:11.1233358Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1233403Z Traceback (most recent call last): 2025-12-04T10:49:11.1233556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1233595Z method(*args, **kwargs) 2025-12-04T10:49:11.1233759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1233798Z method(*args, **kwargs) 2025-12-04T10:49:11.1233961Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1233997Z with policy(): 2025-12-04T10:49:11.1234148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1234187Z raise RuntimeError(msg) 2025-12-04T10:49:11.1234594Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1234598Z 2025-12-04T10:49:11.1234671Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1234955Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1234958Z 2025-12-04T10:49:11.1235043Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1235113Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1235170Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1235343Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1235415Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1235463Z graph_break [] 2025-12-04T10:49:11.1235535Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1235875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1235920Z if out == self.unknown_value: 2025-12-04T10:49:11.1235990Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1236044Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1236124Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1236299Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1236335Z graph_break [] 2025-12-04T10:49:11.1236387Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1236536Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1236581Z Traceback (most recent call last): 2025-12-04T10:49:11.1236734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1236774Z method(*args, **kwargs) 2025-12-04T10:49:11.1236923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1236962Z method(*args, **kwargs) 2025-12-04T10:49:11.1237112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1237148Z with policy(): 2025-12-04T10:49:11.1237300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1237351Z raise RuntimeError(msg) 2025-12-04T10:49:11.1237757Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1237770Z 2025-12-04T10:49:11.1237841Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1238127Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1238129Z 2025-12-04T10:49:11.1238214Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1238286Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1238341Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1238516Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1238588Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1238624Z graph_break [] 2025-12-04T10:49:11.1238694Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1239032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1239075Z if out == self.unknown_value: 2025-12-04T10:49:11.1239156Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1239212Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1239282Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1239456Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1239492Z graph_break [] 2025-12-04T10:49:11.1239562Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1239616Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1239703Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1239876Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1239912Z graph_break [] 2025-12-04T10:49:11.1240155Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dac08232149f1240.xml - 2025-12-04T10:49:11.1240214Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1240853Z FAILED [0.6531s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1240855Z 2025-12-04T10:49:11.1240926Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1241211Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1241233Z 2025-12-04T10:49:11.1241317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1241377Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1241443Z ================== 1 failed, 57 deselected, 2 rerun in 11.39s ================== 2025-12-04T10:49:11.1241479Z Got exit code 1 2025-12-04T10:49:11.1241718Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1241844Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1242090Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-708d03080279a901.xml 2025-12-04T10:49:11.1242147Z ============================= test session starts ============================== 2025-12-04T10:49:11.1242260Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1242300Z cachedir: .pytest_cache 2025-12-04T10:49:11.1242458Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1242503Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1242544Z configfile: pytest.ini 2025-12-04T10:49:11.1242705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1242778Z collecting ... collected 58 items / 35 deselected / 23 selected 2025-12-04T10:49:11.1242829Z stepcurrent: skipping 35 already run items. 2025-12-04T10:49:11.1242889Z Running 23 items in this shard 2025-12-04T10:49:11.1242891Z 2025-12-04T10:49:11.1243137Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.4509s] [ 4%] 2025-12-04T10:49:11.1243378Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4388s] [ 4%] 2025-12-04T10:49:11.1243608Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.4456s] [ 4%] 2025-12-04T10:49:11.1243611Z 2025-12-04T10:49:11.1243661Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1243809Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1243854Z Traceback (most recent call last): 2025-12-04T10:49:11.1244010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1244050Z method(*args, **kwargs) 2025-12-04T10:49:11.1244204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1244242Z method(*args, **kwargs) 2025-12-04T10:49:11.1244395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1244432Z with policy(): 2025-12-04T10:49:11.1244584Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1244623Z raise RuntimeError(msg) 2025-12-04T10:49:11.1245017Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1245049Z 2025-12-04T10:49:11.1245122Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1245405Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1245407Z 2025-12-04T10:49:11.1245493Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1245563Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1245618Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1245792Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1245866Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1245903Z graph_break [] 2025-12-04T10:49:11.1246048Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1246092Z Traceback (most recent call last): 2025-12-04T10:49:11.1246245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1246285Z method(*args, **kwargs) 
2025-12-04T10:49:11.1246436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1246474Z method(*args, **kwargs) 2025-12-04T10:49:11.1246634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1246672Z with policy(): 2025-12-04T10:49:11.1246822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1246863Z raise RuntimeError(msg) 2025-12-04T10:49:11.1247260Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1247274Z 2025-12-04T10:49:11.1247345Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1247627Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1247630Z 2025-12-04T10:49:11.1247715Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1247786Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1247842Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1248014Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1248087Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1248123Z 
graph_break [] 2025-12-04T10:49:11.1248194Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1248248Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1248317Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1248502Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1248537Z graph_break [] 2025-12-04T10:49:11.1248602Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1248748Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1248793Z Traceback (most recent call last): 2025-12-04T10:49:11.1248944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1248985Z method(*args, **kwargs) 2025-12-04T10:49:11.1249135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1249174Z method(*args, **kwargs) 2025-12-04T10:49:11.1249323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1249361Z with policy(): 2025-12-04T10:49:11.1249510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1249552Z raise RuntimeError(msg) 2025-12-04T10:49:11.1249948Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. 
CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1249951Z 2025-12-04T10:49:11.1250022Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1250316Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1250320Z 2025-12-04T10:49:11.1250404Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1250477Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1250530Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1250707Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1250777Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1250824Z graph_break [] 2025-12-04T10:49:11.1250895Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1250950Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1251019Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1251192Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1251228Z graph_break [] 2025-12-04T10:49:11.1251299Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1251351Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1251421Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1251593Z inductor [('pattern_matcher_nodes', 
8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1251632Z graph_break [] 2025-12-04T10:49:11.1251906Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-708d03080279a901.xml - 2025-12-04T10:49:11.1251965Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1252599Z FAILED [0.4456s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1252619Z 2025-12-04T10:49:11.1252690Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1252975Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1252979Z 2025-12-04T10:49:11.1253063Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1253123Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1253191Z ================== 1 failed, 35 deselected, 2 rerun in 3.47s =================== 2025-12-04T10:49:11.1253227Z Got exit code 1 2025-12-04T10:49:11.1253267Z Retrying single test... 
2025-12-04T10:49:11.1253461Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-93f7ed27b864773f.xml 2025-12-04T10:49:11.1253519Z ============================= test session starts ============================== 2025-12-04T10:49:11.1253629Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1253670Z cachedir: .pytest_cache 2025-12-04T10:49:11.1253847Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1253894Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1253933Z configfile: pytest.ini 2025-12-04T10:49:11.1254094Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1254166Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1254449Z stepcurrent: skipping 35 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1254506Z Running 1 items in this shard 2025-12-04T10:49:11.1254508Z 2025-12-04T10:49:11.1254864Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:30:52.722949176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1254868Z 2025-12-04T10:49:11.1255021Z [W1204 10:30:58.486574074 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1255024Z 2025-12-04T10:49:11.1255174Z [W1204 10:30:58.486747670 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255176Z 2025-12-04T10:49:11.1255324Z [W1204 10:30:58.490755087 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255326Z 2025-12-04T10:49:11.1255473Z [W1204 10:30:58.491033102 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255475Z 2025-12-04T10:49:11.1255623Z [W1204 10:30:58.491115130 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255635Z 2025-12-04T10:49:11.1255782Z [W1204 10:30:58.493641034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255796Z 2025-12-04T10:49:11.1255942Z [W1204 10:30:58.493904759 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1255944Z 2025-12-04T10:49:11.1256090Z [W1204 10:30:58.493981878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1256092Z 2025-12-04T10:49:11.1256141Z ('RERUN', {'yellow': True}) [9.2204s] [100%] 2025-12-04T10:49:11.1256496Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:30:59.428099576 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1256499Z 2025-12-04T10:49:11.1256646Z [W1204 10:30:59.428465700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1256650Z 2025-12-04T10:49:11.1256796Z [W1204 10:30:59.428546338 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1256798Z 2025-12-04T10:49:11.1256944Z [W1204 10:30:59.429910423 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1256946Z 2025-12-04T10:49:11.1257095Z [W1204 10:30:59.430250347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1257099Z 2025-12-04T10:49:11.1257256Z [W1204 10:30:59.430331445 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1257259Z 2025-12-04T10:49:11.1257405Z [W1204 10:30:59.432508555 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1257409Z 2025-12-04T10:49:11.1257554Z [W1204 10:30:59.432767991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1257556Z 2025-12-04T10:49:11.1257702Z [W1204 10:30:59.432844749 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1257705Z 2025-12-04T10:49:11.1257753Z ('RERUN', {'yellow': True}) [0.4259s] [100%] 2025-12-04T10:49:11.1258114Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:31:00.846057866 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258117Z 2025-12-04T10:49:11.1258264Z [W1204 10:31:00.846449188 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258267Z 2025-12-04T10:49:11.1258414Z [W1204 10:31:00.846545607 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258416Z 2025-12-04T10:49:11.1258562Z [W1204 10:31:00.847939151 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258564Z 2025-12-04T10:49:11.1258710Z [W1204 10:31:00.848309974 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258712Z 2025-12-04T10:49:11.1258857Z [W1204 10:31:00.848397853 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1258870Z 2025-12-04T10:49:11.1259016Z [W1204 10:31:00.850611382 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1259018Z 2025-12-04T10:49:11.1259178Z [W1204 10:31:00.850879737 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1259180Z 2025-12-04T10:49:11.1259327Z [W1204 10:31:00.850955676 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1259329Z 2025-12-04T10:49:11.1259366Z FAILED [0.4159s] [100%] 2025-12-04T10:49:11.1259368Z 2025-12-04T10:49:11.1259420Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1259567Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1259613Z Traceback (most recent call last): 2025-12-04T10:49:11.1259771Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1259811Z method(*args, **kwargs) 2025-12-04T10:49:11.1259965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1260004Z method(*args, **kwargs) 2025-12-04T10:49:11.1260154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1260192Z with policy(): 2025-12-04T10:49:11.1260344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1260385Z raise RuntimeError(msg) 2025-12-04T10:49:11.1260791Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1260796Z 2025-12-04T10:49:11.1260868Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1261155Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1261158Z 2025-12-04T10:49:11.1261242Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1261326Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1261381Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1261557Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1261629Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1261666Z graph_break [] 2025-12-04T10:49:11.1261736Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1262112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1262156Z if out == self.unknown_value: 2025-12-04T10:49:11.1262308Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1262352Z Traceback (most recent call last): 2025-12-04T10:49:11.1262504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1262560Z method(*args, **kwargs) 2025-12-04T10:49:11.1262709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1262749Z method(*args, **kwargs) 2025-12-04T10:49:11.1262915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1262952Z with policy(): 2025-12-04T10:49:11.1263102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1263142Z raise RuntimeError(msg) 2025-12-04T10:49:11.1263542Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1263545Z 2025-12-04T10:49:11.1263618Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1263903Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1263906Z 2025-12-04T10:49:11.1263991Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1264062Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1264117Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1264292Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1264362Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1264414Z graph_break [] 2025-12-04T10:49:11.1264484Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1264824Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1264867Z if out == self.unknown_value: 2025-12-04T10:49:11.1264937Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1264991Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1265078Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1265251Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1265288Z graph_break [] 2025-12-04T10:49:11.1265341Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1265490Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1265535Z Traceback (most recent call last): 2025-12-04T10:49:11.1265689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1265727Z method(*args, **kwargs) 2025-12-04T10:49:11.1265882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1265922Z method(*args, **kwargs) 2025-12-04T10:49:11.1266072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1266108Z with policy(): 2025-12-04T10:49:11.1266260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1266311Z raise RuntimeError(msg) 2025-12-04T10:49:11.1266708Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1266727Z 2025-12-04T10:49:11.1266800Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1267083Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1267085Z 2025-12-04T10:49:11.1267170Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1267242Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1267296Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1267469Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1267541Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1267577Z graph_break [] 2025-12-04T10:49:11.1267647Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1267992Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1268034Z if out == self.unknown_value: 2025-12-04T10:49:11.1268117Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1268172Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1268243Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1268416Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1268452Z graph_break [] 2025-12-04T10:49:11.1268522Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1268576Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1268655Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1268827Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1268862Z graph_break [] 2025-12-04T10:49:11.1269104Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-93f7ed27b864773f.xml - 2025-12-04T10:49:11.1269162Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1269783Z FAILED [0.4159s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1269785Z 2025-12-04T10:49:11.1269857Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1270141Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1270166Z 2025-12-04T10:49:11.1270250Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1270310Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1270377Z ================== 1 failed, 57 deselected, 2 rerun in 10.20s ================== 2025-12-04T10:49:11.1270413Z Got exit code 1 2025-12-04T10:49:11.1270453Z Retrying single test... 2025-12-04T10:49:11.1270648Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2bc6c2a722389482.xml 2025-12-04T10:49:11.1270706Z ============================= test session starts ============================== 2025-12-04T10:49:11.1270817Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1270859Z cachedir: .pytest_cache 2025-12-04T10:49:11.1271014Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1271062Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1271102Z configfile: pytest.ini 2025-12-04T10:49:11.1271265Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1271338Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1271620Z stepcurrent: skipping 35 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1271665Z Running 1 items in this shard 2025-12-04T10:49:11.1271667Z 2025-12-04T10:49:11.1272076Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:31:08.108147577 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272081Z 2025-12-04T10:49:11.1272233Z [W1204 10:31:16.618789835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272235Z 2025-12-04T10:49:11.1272384Z [W1204 10:31:16.618932102 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272403Z 2025-12-04T10:49:11.1272550Z [W1204 10:31:16.622155043 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272552Z 2025-12-04T10:49:11.1272700Z [W1204 10:31:16.622445098 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272702Z 2025-12-04T10:49:11.1272848Z [W1204 10:31:16.622522156 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1272851Z 2025-12-04T10:49:11.1272998Z [W1204 10:31:16.624854113 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1273000Z 2025-12-04T10:49:11.1273145Z [W1204 10:31:16.625121849 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1273148Z 2025-12-04T10:49:11.1273295Z [W1204 10:31:16.625201137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1273298Z 2025-12-04T10:49:11.1273347Z ('RERUN', {'yellow': True}) [9.9356s] [100%] 2025-12-04T10:49:11.1273698Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:31:17.534145774 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1273727Z 2025-12-04T10:49:11.1273874Z [W1204 10:31:17.534534676 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1273876Z 2025-12-04T10:49:11.1274022Z [W1204 10:31:17.534630505 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274023Z 2025-12-04T10:49:11.1274171Z [W1204 10:31:17.536018079 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274173Z 2025-12-04T10:49:11.1274321Z [W1204 10:31:17.536373863 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274324Z 2025-12-04T10:49:11.1274470Z [W1204 10:31:17.536457131 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274473Z 2025-12-04T10:49:11.1274620Z [W1204 10:31:17.538621272 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1274622Z 2025-12-04T10:49:11.1274769Z [W1204 10:31:17.538883737 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274771Z 2025-12-04T10:49:11.1274919Z [W1204 10:31:17.538960085 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1274920Z 2025-12-04T10:49:11.1274969Z ('RERUN', {'yellow': True}) [0.4343s] [100%] 2025-12-04T10:49:11.1275328Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:31:17.999085965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1275332Z 2025-12-04T10:49:11.1275479Z [W1204 10:31:17.999473748 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1275480Z 2025-12-04T10:49:11.1275626Z [W1204 10:31:17.999567686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1275628Z 2025-12-04T10:49:11.1275788Z [W1204 10:31:17.000946791 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1275790Z 2025-12-04T10:49:11.1275937Z [W1204 10:31:17.001290525 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1275941Z 2025-12-04T10:49:11.1276087Z [W1204 10:31:17.001371943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1276091Z 2025-12-04T10:49:11.1276238Z [W1204 10:31:17.003547113 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1276239Z 2025-12-04T10:49:11.1276387Z [W1204 10:31:17.003804609 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1276389Z 2025-12-04T10:49:11.1276538Z [W1204 10:31:17.003879297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1276540Z 2025-12-04T10:49:11.1276578Z FAILED [0.4596s] [100%] 2025-12-04T10:49:11.1276580Z 2025-12-04T10:49:11.1276633Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1276791Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1276836Z Traceback (most recent call last): 2025-12-04T10:49:11.1277008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1277047Z method(*args, **kwargs) 2025-12-04T10:49:11.1277198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1277236Z method(*args, **kwargs) 2025-12-04T10:49:11.1277388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1277425Z with policy(): 2025-12-04T10:49:11.1277576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1277618Z raise RuntimeError(msg) 2025-12-04T10:49:11.1278010Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1278014Z 2025-12-04T10:49:11.1278086Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1278374Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1278376Z 2025-12-04T10:49:11.1278462Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1278544Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1278601Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1278777Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1278851Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1278887Z graph_break [] 2025-12-04T10:49:11.1278958Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1279310Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1279355Z if out == self.unknown_value: 2025-12-04T10:49:11.1279502Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1279547Z Traceback (most recent call last): 2025-12-04T10:49:11.1279699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1279739Z method(*args, **kwargs) 2025-12-04T10:49:11.1279888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1279928Z method(*args, **kwargs) 2025-12-04T10:49:11.1280075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1280113Z with policy(): 2025-12-04T10:49:11.1280264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1280305Z raise RuntimeError(msg) 2025-12-04T10:49:11.1280702Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1280734Z 2025-12-04T10:49:11.1280806Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1281093Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1281096Z 2025-12-04T10:49:11.1281181Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1281252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1281306Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1281481Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1281553Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1281590Z graph_break [] 2025-12-04T10:49:11.1281660Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1282036Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1282079Z if out == self.unknown_value: 2025-12-04T10:49:11.1282149Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1282204Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1282275Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1282467Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1282503Z graph_break [] 2025-12-04T10:49:11.1282555Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1282702Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1282748Z Traceback (most recent call last): 2025-12-04T10:49:11.1282916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1282956Z method(*args, **kwargs) 2025-12-04T10:49:11.1283107Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1283146Z method(*args, **kwargs) 2025-12-04T10:49:11.1283298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1283335Z with policy(): 2025-12-04T10:49:11.1283485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1283528Z raise RuntimeError(msg) 2025-12-04T10:49:11.1283925Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1283929Z 2025-12-04T10:49:11.1284000Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1284284Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1284301Z 2025-12-04T10:49:11.1284385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1284471Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1284525Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1284697Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1284768Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1284805Z graph_break [] 2025-12-04T10:49:11.1284875Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1285219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1285261Z if out == self.unknown_value: 2025-12-04T10:49:11.1285332Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1285386Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1285457Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1285630Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1285666Z graph_break [] 2025-12-04T10:49:11.1285737Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1285789Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1285859Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1286042Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1286079Z graph_break [] 2025-12-04T10:49:11.1286318Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-2bc6c2a722389482.xml - 2025-12-04T10:49:11.1286378Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1287010Z FAILED [0.4596s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1287014Z 2025-12-04T10:49:11.1287086Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1287370Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1287373Z 2025-12-04T10:49:11.1287456Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1287518Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1287584Z ================== 1 failed, 57 deselected, 2 rerun in 10.97s ================== 2025-12-04T10:49:11.1287620Z Got exit code 1 2025-12-04T10:49:11.1287856Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1287993Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1288188Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3aff8dbeb87c2827.xml 2025-12-04T10:49:11.1288257Z ============================= test session starts ============================== 2025-12-04T10:49:11.1288367Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1288409Z cachedir: .pytest_cache 2025-12-04T10:49:11.1288567Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1288613Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1288652Z configfile: pytest.ini 2025-12-04T10:49:11.1288813Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1288886Z collecting ... collected 58 items / 36 deselected / 22 selected 2025-12-04T10:49:11.1288938Z stepcurrent: skipping 36 already run items. 2025-12-04T10:49:11.1288983Z Running 22 items in this shard 2025-12-04T10:49:11.1288985Z 2025-12-04T10:49:11.1289230Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0429s] [ 4%] 2025-12-04T10:49:11.1289472Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5059s] [ 4%] 2025-12-04T10:49:11.1289693Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.5003s] [ 4%] 2025-12-04T10:49:11.1289707Z 2025-12-04T10:49:11.1289759Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1289906Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1289952Z Traceback (most recent call last): 2025-12-04T10:49:11.1290107Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1290147Z method(*args, **kwargs) 2025-12-04T10:49:11.1290298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1290347Z method(*args, **kwargs) 2025-12-04T10:49:11.1290496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1290533Z with policy(): 2025-12-04T10:49:11.1290686Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1290726Z raise RuntimeError(msg) 2025-12-04T10:49:11.1291118Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1291121Z 2025-12-04T10:49:11.1291193Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1291478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1291480Z 2025-12-04T10:49:11.1291564Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1291648Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1291703Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1292011Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1292101Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1292136Z graph_break [] 2025-12-04T10:49:11.1292283Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1292327Z Traceback (most recent call last): 2025-12-04T10:49:11.1292481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1292520Z method(*args, **kwargs) 2025-12-04T10:49:11.1292672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1292710Z method(*args, **kwargs) 2025-12-04T10:49:11.1292863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1292899Z with policy(): 2025-12-04T10:49:11.1293050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1293090Z raise RuntimeError(msg) 2025-12-04T10:49:11.1293487Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1293491Z 2025-12-04T10:49:11.1293580Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1293865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1293868Z 2025-12-04T10:49:11.1293953Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1294025Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1294082Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1294377Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1294450Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1294487Z graph_break [] 2025-12-04T10:49:11.1294558Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1294610Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1294681Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1294945Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1294983Z graph_break [] 2025-12-04T10:49:11.1295035Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1295184Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1295228Z Traceback (most recent call last): 2025-12-04T10:49:11.1295399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1295437Z method(*args, **kwargs) 2025-12-04T10:49:11.1295587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1295638Z method(*args, **kwargs) 2025-12-04T10:49:11.1295788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1295825Z with policy(): 2025-12-04T10:49:11.1295976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1296017Z raise RuntimeError(msg) 2025-12-04T10:49:11.1296419Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1296422Z 2025-12-04T10:49:11.1296494Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1296778Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1296780Z 2025-12-04T10:49:11.1296864Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1296936Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1296991Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1297285Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1297358Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1297396Z graph_break [] 2025-12-04T10:49:11.1297466Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1297520Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1297589Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1297867Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1297902Z graph_break [] 2025-12-04T10:49:11.1297972Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1298026Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1298096Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1298365Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1298403Z graph_break [] 2025-12-04T10:49:11.1298646Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3aff8dbeb87c2827.xml - 2025-12-04T10:49:11.1298707Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1299330Z FAILED [0.5003s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1299355Z 2025-12-04T10:49:11.1299426Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1299709Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1299711Z 2025-12-04T10:49:11.1299795Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1299856Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1299922Z ================== 1 failed, 36 deselected, 2 rerun in 4.22s =================== 2025-12-04T10:49:11.1299960Z Got exit code 1 2025-12-04T10:49:11.1300000Z Retrying single test... 2025-12-04T10:49:11.1300194Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8ae7e716f26514b0.xml 2025-12-04T10:49:11.1300251Z ============================= test session starts ============================== 2025-12-04T10:49:11.1300362Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1300402Z cachedir: .pytest_cache 2025-12-04T10:49:11.1300561Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1300607Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1300646Z configfile: pytest.ini 2025-12-04T10:49:11.1300808Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1300890Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1301172Z stepcurrent: skipping 36 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1301217Z Running 1 items in this shard 2025-12-04T10:49:11.1301219Z 2025-12-04T10:49:11.1301589Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:31:38.230525695 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1301592Z 2025-12-04T10:49:11.1301745Z [W1204 10:31:45.089308070 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1301746Z 2025-12-04T10:49:11.1301933Z [W1204 10:31:45.089513516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1301936Z 2025-12-04T10:49:11.1302084Z [W1204 10:31:45.093065751 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1302087Z 2025-12-04T10:49:11.1302233Z [W1204 10:31:45.093435904 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1302236Z 2025-12-04T10:49:11.1302385Z [W1204 10:31:45.093523043 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1302387Z 2025-12-04T10:49:11.1302532Z [W1204 10:31:45.096284852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1302534Z 2025-12-04T10:49:11.1302682Z [W1204 10:31:45.096593956 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1302698Z 2025-12-04T10:49:11.1302845Z [W1204 10:31:45.096672965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1302861Z 2025-12-04T10:49:11.1302911Z ('RERUN', {'yellow': True}) [10.1206s] [100%] 2025-12-04T10:49:11.1303266Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:31:46.996346532 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1303268Z 2025-12-04T10:49:11.1303414Z [W1204 10:31:46.996979280 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1303416Z 2025-12-04T10:49:11.1303565Z [W1204 10:31:46.997095208 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1303567Z 2025-12-04T10:49:11.1303718Z [W1204 10:31:46.999014643 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1303721Z 2025-12-04T10:49:11.1303868Z [W1204 10:31:46.999405816 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1303870Z 2025-12-04T10:49:11.1304017Z [W1204 10:31:46.999492384 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1304020Z 2025-12-04T10:49:11.1304165Z [W1204 10:31:46.002016168 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1304167Z 2025-12-04T10:49:11.1304329Z [W1204 10:31:46.002366282 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1304332Z 2025-12-04T10:49:11.1304478Z [W1204 10:31:46.002447070 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1304482Z 2025-12-04T10:49:11.1304530Z ('RERUN', {'yellow': True}) [0.7487s] [100%] 2025-12-04T10:49:11.1304882Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:31:47.726944255 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1304898Z 2025-12-04T10:49:11.1305044Z [W1204 10:31:47.727333318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305048Z 2025-12-04T10:49:11.1305196Z [W1204 10:31:47.727425976 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305199Z 2025-12-04T10:49:11.1305346Z [W1204 10:31:47.728855950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305348Z 2025-12-04T10:49:11.1305495Z [W1204 10:31:47.729138455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305497Z 2025-12-04T10:49:11.1305645Z [W1204 10:31:47.729220273 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305646Z 2025-12-04T10:49:11.1305797Z [W1204 10:31:47.731477882 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1305799Z 2025-12-04T10:49:11.1305947Z [W1204 10:31:47.731751277 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1305959Z 2025-12-04T10:49:11.1306105Z [W1204 10:31:47.731827885 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1306121Z 2025-12-04T10:49:11.1306160Z FAILED [0.7153s] [100%] 2025-12-04T10:49:11.1306163Z 2025-12-04T10:49:11.1306215Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1306361Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1306407Z Traceback (most recent call last): 2025-12-04T10:49:11.1306564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1306604Z method(*args, **kwargs) 2025-12-04T10:49:11.1306757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1306798Z method(*args, **kwargs) 2025-12-04T10:49:11.1306947Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1306986Z with policy(): 2025-12-04T10:49:11.1307137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1307177Z raise RuntimeError(msg) 2025-12-04T10:49:11.1307571Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1307573Z 2025-12-04T10:49:11.1307647Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1307949Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1307955Z 2025-12-04T10:49:11.1308041Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1308113Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1308167Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1308448Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1308521Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1308557Z graph_break [] 2025-12-04T10:49:11.1308628Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1308972Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1309017Z if out == self.unknown_value: 2025-12-04T10:49:11.1309164Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1309208Z Traceback (most recent call last): 2025-12-04T10:49:11.1309362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1309401Z method(*args, **kwargs) 2025-12-04T10:49:11.1309553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1309591Z method(*args, **kwargs) 2025-12-04T10:49:11.1309753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1309789Z with policy(): 2025-12-04T10:49:11.1309953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1309994Z raise RuntimeError(msg) 2025-12-04T10:49:11.1310396Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1310398Z 2025-12-04T10:49:11.1310470Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1310753Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1310756Z 2025-12-04T10:49:11.1310843Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1310914Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1310969Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1311239Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1311311Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1311347Z graph_break [] 2025-12-04T10:49:11.1311418Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1311769Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1311814Z if out == self.unknown_value: 2025-12-04T10:49:11.1311905Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1311959Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1312030Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1312319Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1312357Z graph_break [] 2025-12-04T10:49:11.1312408Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1312557Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1312602Z Traceback (most recent call last): 2025-12-04T10:49:11.1312757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1312795Z method(*args, **kwargs) 2025-12-04T10:49:11.1312947Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1312986Z method(*args, **kwargs) 2025-12-04T10:49:11.1313139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1313175Z with policy(): 2025-12-04T10:49:11.1313329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1313383Z raise RuntimeError(msg) 2025-12-04T10:49:11.1313780Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1313797Z 2025-12-04T10:49:11.1313871Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1314155Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1314157Z 2025-12-04T10:49:11.1314243Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1314314Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1314370Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1314640Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1314713Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1314748Z graph_break [] 2025-12-04T10:49:11.1314818Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1315159Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1315202Z if out == self.unknown_value: 2025-12-04T10:49:11.1315286Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1315341Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1315412Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1315679Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1315715Z graph_break [] 2025-12-04T10:49:11.1315785Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1315849Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1315919Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1316185Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1316222Z graph_break [] 2025-12-04T10:49:11.1316462Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8ae7e716f26514b0.xml - 2025-12-04T10:49:11.1316521Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1317154Z FAILED [0.7153s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1317157Z 2025-12-04T10:49:11.1317240Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1317521Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1317534Z 2025-12-04T10:49:11.1317619Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1317679Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1317749Z ================== 1 failed, 57 deselected, 2 rerun in 11.74s ================== 2025-12-04T10:49:11.1317786Z Got exit code 1 2025-12-04T10:49:11.1317827Z Retrying single test... 2025-12-04T10:49:11.1318022Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cb0ad7c0a23d05e0.xml 2025-12-04T10:49:11.1318080Z ============================= test session starts ============================== 2025-12-04T10:49:11.1318193Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1318233Z cachedir: .pytest_cache 2025-12-04T10:49:11.1318392Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1318437Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1318477Z configfile: pytest.ini 2025-12-04T10:49:11.1318638Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1318711Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1319006Z stepcurrent: skipping 36 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1319052Z Running 1 items in this shard 2025-12-04T10:49:11.1319055Z 2025-12-04T10:49:11.1319415Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:31:57.475617553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1319419Z 2025-12-04T10:49:11.1319576Z [W1204 10:32:05.099258259 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1319578Z 2025-12-04T10:49:11.1319741Z [W1204 10:32:05.099448236 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1319743Z 2025-12-04T10:49:11.1319889Z [W1204 10:32:05.102767835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1319893Z 2025-12-04T10:49:11.1320040Z [W1204 10:32:05.105196690 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1320042Z 2025-12-04T10:49:11.1320189Z [W1204 10:32:05.105289899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1320190Z 2025-12-04T10:49:11.1320338Z [W1204 10:32:05.107918701 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1320340Z 2025-12-04T10:49:11.1320489Z [W1204 10:32:05.108258674 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1320491Z 2025-12-04T10:49:11.1320637Z [W1204 10:32:05.108336913 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1320651Z 2025-12-04T10:49:11.1320703Z ('RERUN', {'yellow': True}) [10.6302s] [100%] 2025-12-04T10:49:11.1321151Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:32:06.700337988 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1321171Z 2025-12-04T10:49:11.1321321Z [W1204 10:32:06.700720391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1321322Z 2025-12-04T10:49:11.1321469Z [W1204 10:32:06.700809809 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1321473Z 2025-12-04T10:49:11.1321618Z [W1204 10:32:06.702198404 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1321621Z 2025-12-04T10:49:11.1321768Z [W1204 10:32:06.702457719 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1321770Z 2025-12-04T10:49:11.1321953Z [W1204 10:32:06.702538818 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1321957Z 2025-12-04T10:49:11.1322105Z [W1204 10:32:06.704716318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1322107Z 2025-12-04T10:49:11.1322253Z [W1204 10:32:06.704977513 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1322255Z 2025-12-04T10:49:11.1322401Z [W1204 10:32:06.705062771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1322403Z 2025-12-04T10:49:11.1322467Z ('RERUN', {'yellow': True}) [0.4937s] [100%] 2025-12-04T10:49:11.1322822Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:32:06.219911609 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1322825Z 2025-12-04T10:49:11.1322973Z [W1204 10:32:06.220332791 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1322975Z 2025-12-04T10:49:11.1323134Z [W1204 10:32:06.220421819 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1323137Z 2025-12-04T10:49:11.1323285Z [W1204 10:32:06.221817614 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1323287Z 2025-12-04T10:49:11.1323439Z [W1204 10:32:06.222076029 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1323441Z 2025-12-04T10:49:11.1323587Z [W1204 10:32:06.222154018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1323590Z 2025-12-04T10:49:11.1323738Z [W1204 10:32:06.224326488 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1323739Z 2025-12-04T10:49:11.1323885Z [W1204 10:32:06.224580163 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1323887Z 2025-12-04T10:49:11.1324035Z [W1204 10:32:06.224654782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1324037Z 2025-12-04T10:49:11.1324075Z FAILED [0.4833s] [100%] 2025-12-04T10:49:11.1324091Z 2025-12-04T10:49:11.1324142Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1324290Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1324348Z Traceback (most recent call last): 2025-12-04T10:49:11.1324504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1324545Z method(*args, **kwargs) 2025-12-04T10:49:11.1324698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1324736Z method(*args, **kwargs) 2025-12-04T10:49:11.1324886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1324923Z with policy(): 2025-12-04T10:49:11.1325076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1325116Z raise RuntimeError(msg) 2025-12-04T10:49:11.1325508Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1325511Z 2025-12-04T10:49:11.1325584Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1325870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1325873Z 2025-12-04T10:49:11.1325973Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1326045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1326101Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1326373Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1326447Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1326483Z graph_break [] 2025-12-04T10:49:11.1326566Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1326907Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1326954Z if out == self.unknown_value: 2025-12-04T10:49:11.1327100Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1327146Z Traceback (most recent call last): 2025-12-04T10:49:11.1327297Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1327337Z method(*args, **kwargs) 2025-12-04T10:49:11.1327486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1327526Z method(*args, **kwargs) 2025-12-04T10:49:11.1327675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1327712Z with policy(): 2025-12-04T10:49:11.1327864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1327914Z raise RuntimeError(msg) 2025-12-04T10:49:11.1328308Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1328323Z 2025-12-04T10:49:11.1328395Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1328680Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1328682Z 2025-12-04T10:49:11.1328767Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1328840Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1328895Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1329164Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1329236Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1329272Z graph_break [] 2025-12-04T10:49:11.1329343Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1329682Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1329726Z if out == self.unknown_value: 2025-12-04T10:49:11.1329809Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1329863Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1329933Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1330204Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1330240Z graph_break [] 2025-12-04T10:49:11.1330291Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1330449Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1330496Z Traceback (most recent call last): 2025-12-04T10:49:11.1330651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1330692Z method(*args, **kwargs) 2025-12-04T10:49:11.1330842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1330883Z method(*args, **kwargs) 2025-12-04T10:49:11.1331032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1331068Z with policy(): 2025-12-04T10:49:11.1331220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1331261Z raise RuntimeError(msg) 2025-12-04T10:49:11.1331659Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1331672Z 2025-12-04T10:49:11.1331743Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1332075Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1332095Z 2025-12-04T10:49:11.1332179Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1332251Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1332305Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1332577Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1332649Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1332685Z graph_break [] 2025-12-04T10:49:11.1332756Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1333094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1333137Z if out == self.unknown_value: 2025-12-04T10:49:11.1333208Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1333262Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1333331Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1333610Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1333647Z graph_break [] 2025-12-04T10:49:11.1333719Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1333772Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1333842Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1334126Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1334164Z graph_break [] 2025-12-04T10:49:11.1334409Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cb0ad7c0a23d05e0.xml - 2025-12-04T10:49:11.1334470Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1335093Z FAILED [0.4833s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1335097Z 2025-12-04T10:49:11.1335168Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1335451Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1335472Z 2025-12-04T10:49:11.1335556Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1335617Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1335696Z ================== 1 failed, 57 deselected, 2 rerun in 11.78s ================== 2025-12-04T10:49:11.1335734Z Got exit code 1 2025-12-04T10:49:11.1335970Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1336152Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1338459Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a67bd3ed4314b80b.xml 2025-12-04T10:49:11.1338528Z ============================= test session starts ============================== 2025-12-04T10:49:11.1338645Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1338687Z cachedir: .pytest_cache 2025-12-04T10:49:11.1338850Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1338897Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1338937Z configfile: pytest.ini 2025-12-04T10:49:11.1339104Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1339182Z collecting ... collected 58 items / 37 deselected / 21 selected 2025-12-04T10:49:11.1339234Z stepcurrent: skipping 37 already run items. 2025-12-04T10:49:11.1339278Z Running 21 items in this shard 2025-12-04T10:49:11.1339280Z 2025-12-04T10:49:11.1339549Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7233s] [ 4%] 2025-12-04T10:49:11.1339794Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6433s] [ 4%] 2025-12-04T10:49:11.1340017Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.6137s] [ 4%] 2025-12-04T10:49:11.1340019Z 2025-12-04T10:49:11.1340072Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1340235Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1340281Z Traceback (most recent call last): 2025-12-04T10:49:11.1340439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1340480Z method(*args, **kwargs) 2025-12-04T10:49:11.1340631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1340673Z method(*args, **kwargs) 2025-12-04T10:49:11.1340824Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1340862Z with policy(): 2025-12-04T10:49:11.1341014Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1341059Z raise RuntimeError(msg) 2025-12-04T10:49:11.1341459Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1341473Z 2025-12-04T10:49:11.1341545Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1341834Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1341893Z 2025-12-04T10:49:11.1341978Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1342051Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1342109Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1342286Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1342358Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1342396Z graph_break [] 2025-12-04T10:49:11.1342544Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1342590Z Traceback (most recent call last): 
2025-12-04T10:49:11.1342743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1342782Z method(*args, **kwargs) 2025-12-04T10:49:11.1342931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1342971Z method(*args, **kwargs) 2025-12-04T10:49:11.1343122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1343159Z with policy(): 2025-12-04T10:49:11.1343330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1343373Z raise RuntimeError(msg) 2025-12-04T10:49:11.1343777Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1343780Z 2025-12-04T10:49:11.1343850Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1344153Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1344156Z 2025-12-04T10:49:11.1344240Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1344315Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1344370Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1344544Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1344616Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1344652Z graph_break [] 2025-12-04T10:49:11.1344723Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1344778Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1344847Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1345021Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1345072Z graph_break [] 2025-12-04T10:49:11.1345123Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1345278Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1345341Z Traceback (most recent call last): 2025-12-04T10:49:11.1345495Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1345534Z method(*args, **kwargs) 2025-12-04T10:49:11.1345685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1345725Z method(*args, **kwargs) 2025-12-04T10:49:11.1345874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1345910Z with policy(): 2025-12-04T10:49:11.1346062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1346102Z raise RuntimeError(msg) 2025-12-04T10:49:11.1346505Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1346509Z 2025-12-04T10:49:11.1346579Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1346865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1346868Z 2025-12-04T10:49:11.1346967Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1347039Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1347094Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1347268Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1347339Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1347375Z graph_break [] 2025-12-04T10:49:11.1347447Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1347511Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1347582Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1347755Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1347793Z graph_break [] 2025-12-04T10:49:11.1347863Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1347917Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1347987Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1348159Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1348194Z graph_break [] 2025-12-04T10:49:11.1348439Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a67bd3ed4314b80b.xml - 2025-12-04T10:49:11.1348497Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1349133Z FAILED [0.6137s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1349160Z 2025-12-04T10:49:11.1349231Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1349516Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1349519Z 2025-12-04T10:49:11.1349603Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1349665Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1349733Z ================== 1 failed, 37 deselected, 2 rerun in 4.14s =================== 2025-12-04T10:49:11.1349769Z Got exit code 1 2025-12-04T10:49:11.1349811Z Retrying single test... 
2025-12-04T10:49:11.1350007Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d139ed75b7009a9d.xml 2025-12-04T10:49:11.1350064Z ============================= test session starts ============================== 2025-12-04T10:49:11.1350176Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1350217Z cachedir: .pytest_cache 2025-12-04T10:49:11.1350376Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1350421Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1350462Z configfile: pytest.ini 2025-12-04T10:49:11.1350638Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1350711Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1350995Z stepcurrent: skipping 37 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1351040Z Running 1 items in this shard 2025-12-04T10:49:11.1351042Z 2025-12-04T10:49:11.1351414Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:27.365565169 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1351417Z 2025-12-04T10:49:11.1351571Z [W1204 10:32:35.047993794 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1351574Z 2025-12-04T10:49:11.1351723Z [W1204 10:32:35.048140891 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1351727Z 2025-12-04T10:49:11.1351911Z [W1204 10:32:35.051295924 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1351913Z 2025-12-04T10:49:11.1352062Z [W1204 10:32:35.051588198 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1352065Z 2025-12-04T10:49:11.1352212Z [W1204 10:32:35.051665977 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1352214Z 2025-12-04T10:49:11.1352364Z [W1204 10:32:35.054142782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1352390Z 2025-12-04T10:49:11.1352538Z [W1204 10:32:35.054405467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1352556Z 2025-12-04T10:49:11.1352703Z [W1204 10:32:35.054482545 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1352704Z 2025-12-04T10:49:11.1352756Z ('RERUN', {'yellow': True}) [10.4350s] [100%] 2025-12-04T10:49:11.1353113Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:36.096118522 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1353115Z 2025-12-04T10:49:11.1353263Z [W1204 10:32:36.096486276 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1353266Z 2025-12-04T10:49:11.1353412Z [W1204 10:32:36.096567124 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1353415Z 2025-12-04T10:49:11.1353561Z [W1204 10:32:36.097925819 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1353563Z 2025-12-04T10:49:11.1353710Z [W1204 10:32:36.098253743 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1353711Z 2025-12-04T10:49:11.1353859Z [W1204 10:32:36.098333262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1353862Z 2025-12-04T10:49:11.1354022Z [W1204 10:32:36.100624360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1354025Z 2025-12-04T10:49:11.1354172Z [W1204 10:32:36.100886515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1354177Z 2025-12-04T10:49:11.1354322Z [W1204 10:32:36.100963574 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1354324Z 2025-12-04T10:49:11.1354373Z ('RERUN', {'yellow': True}) [0.5300s] [100%] 2025-12-04T10:49:11.1354740Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:37.588788241 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1354742Z 2025-12-04T10:49:11.1354891Z [W1204 10:32:37.589155494 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1354894Z 2025-12-04T10:49:11.1355040Z [W1204 10:32:37.589240183 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355043Z 2025-12-04T10:49:11.1355190Z [W1204 10:32:37.590618887 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355192Z 2025-12-04T10:49:11.1355339Z [W1204 10:32:37.590938461 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355340Z 2025-12-04T10:49:11.1355488Z [W1204 10:32:37.591021800 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355490Z 2025-12-04T10:49:11.1355637Z [W1204 10:32:37.593290428 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355651Z 2025-12-04T10:49:11.1355797Z [W1204 10:32:37.593547224 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1355811Z 2025-12-04T10:49:11.1355960Z [W1204 10:32:37.593622722 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1355962Z 2025-12-04T10:49:11.1356000Z FAILED [0.5097s] [100%] 2025-12-04T10:49:11.1356002Z 2025-12-04T10:49:11.1356054Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1356205Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1356249Z Traceback (most recent call last): 2025-12-04T10:49:11.1356405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1356447Z method(*args, **kwargs) 2025-12-04T10:49:11.1356598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1356638Z method(*args, **kwargs) 2025-12-04T10:49:11.1356789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1356825Z with policy(): 2025-12-04T10:49:11.1356977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1357016Z raise RuntimeError(msg) 2025-12-04T10:49:11.1357432Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1357438Z 2025-12-04T10:49:11.1357510Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1357797Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1357800Z 2025-12-04T10:49:11.1357885Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1357956Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1358022Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1358200Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1358272Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1358310Z graph_break [] 2025-12-04T10:49:11.1358382Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1358727Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1358773Z if out == self.unknown_value: 2025-12-04T10:49:11.1358920Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1358966Z Traceback (most recent call last): 2025-12-04T10:49:11.1359118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1359158Z method(*args, **kwargs) 2025-12-04T10:49:11.1359308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1359359Z method(*args, **kwargs) 2025-12-04T10:49:11.1359508Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1359556Z with policy(): 2025-12-04T10:49:11.1359708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1359749Z raise RuntimeError(msg) 2025-12-04T10:49:11.1360153Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1360156Z 2025-12-04T10:49:11.1360227Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1360516Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1360519Z 2025-12-04T10:49:11.1360604Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1360675Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1360729Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1360905Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1360978Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1361013Z graph_break [] 2025-12-04T10:49:11.1361099Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1361440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1361485Z if out == self.unknown_value: 2025-12-04T10:49:11.1361554Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1361608Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1361678Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1361903Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1361939Z graph_break [] 2025-12-04T10:49:11.1361991Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1362141Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1362187Z Traceback (most recent call last): 2025-12-04T10:49:11.1362338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1362380Z method(*args, **kwargs) 2025-12-04T10:49:11.1362528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1362568Z method(*args, **kwargs) 2025-12-04T10:49:11.1362720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1362757Z with policy(): 2025-12-04T10:49:11.1362907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1362949Z raise RuntimeError(msg) 2025-12-04T10:49:11.1363365Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1363385Z 2025-12-04T10:49:11.1363457Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1363742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1363744Z 2025-12-04T10:49:11.1363833Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1363903Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1363959Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1364134Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1364205Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1364242Z graph_break [] 2025-12-04T10:49:11.1364312Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1364654Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1364700Z if out == self.unknown_value: 2025-12-04T10:49:11.1364770Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1364838Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1364910Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1365084Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1365121Z graph_break [] 2025-12-04T10:49:11.1365192Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1365245Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1365316Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1365505Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1365542Z graph_break [] 2025-12-04T10:49:11.1365781Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d139ed75b7009a9d.xml - 2025-12-04T10:49:11.1365844Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1366475Z FAILED [0.5097s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1366483Z 2025-12-04T10:49:11.1366553Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1366838Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1366852Z 2025-12-04T10:49:11.1366936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1367007Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1367074Z ================== 1 failed, 57 deselected, 2 rerun in 11.64s ================== 2025-12-04T10:49:11.1367111Z Got exit code 1 2025-12-04T10:49:11.1367150Z Retrying single test... 2025-12-04T10:49:11.1367349Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a41dfcab59f95213.xml 2025-12-04T10:49:11.1367404Z ============================= test session starts ============================== 2025-12-04T10:49:11.1367515Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1367555Z cachedir: .pytest_cache 2025-12-04T10:49:11.1367714Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1367760Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1367801Z configfile: pytest.ini 2025-12-04T10:49:11.1367962Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1368036Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1368319Z stepcurrent: skipping 37 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1368362Z Running 1 items in this shard 2025-12-04T10:49:11.1368364Z 2025-12-04T10:49:11.1368735Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:46.195052197 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1368739Z 2025-12-04T10:49:11.1368890Z [W1204 10:32:54.770449899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1368892Z 2025-12-04T10:49:11.1369043Z [W1204 10:32:54.770627426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369045Z 2025-12-04T10:49:11.1369210Z [W1204 10:32:54.773850417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369212Z 2025-12-04T10:49:11.1369359Z [W1204 10:32:54.774196401 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369361Z 2025-12-04T10:49:11.1369509Z [W1204 10:32:54.774278259 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369512Z 2025-12-04T10:49:11.1369658Z [W1204 10:32:54.776753634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369661Z 2025-12-04T10:49:11.1369808Z [W1204 10:32:54.777048679 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1369810Z 2025-12-04T10:49:11.1369956Z [W1204 10:32:54.777127627 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1369960Z 2025-12-04T10:49:11.1370010Z ('RERUN', {'yellow': True}) [10.2326s] [100%] 2025-12-04T10:49:11.1370365Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:55.809912873 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1370378Z 2025-12-04T10:49:11.1370524Z [W1204 10:32:55.810320255 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1370538Z 2025-12-04T10:49:11.1370684Z [W1204 10:32:55.810411804 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1370686Z 2025-12-04T10:49:11.1370834Z [W1204 10:32:55.811809388 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1370837Z 2025-12-04T10:49:11.1370984Z [W1204 10:32:55.812152202 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1370986Z 2025-12-04T10:49:11.1371133Z [W1204 10:32:55.812235350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1371136Z 2025-12-04T10:49:11.1371284Z [W1204 10:32:55.814451720 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1371287Z 2025-12-04T10:49:11.1371433Z [W1204 10:32:55.814712945 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1371435Z 2025-12-04T10:49:11.1371584Z [W1204 10:32:55.814791583 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1371586Z 2025-12-04T10:49:11.1371636Z ('RERUN', {'yellow': True}) [0.4840s] [100%] 2025-12-04T10:49:11.1372055Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:32:55.280437529 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372058Z 2025-12-04T10:49:11.1372205Z [W1204 10:32:55.280852262 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372208Z 2025-12-04T10:49:11.1372356Z [W1204 10:32:55.280953140 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372359Z 2025-12-04T10:49:11.1372520Z [W1204 10:32:55.282384054 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372522Z 2025-12-04T10:49:11.1372670Z [W1204 10:32:55.282730367 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372672Z 2025-12-04T10:49:11.1372819Z [W1204 10:32:55.282810756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1372822Z 2025-12-04T10:49:11.1372969Z [W1204 10:32:55.285029405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1372972Z 2025-12-04T10:49:11.1373119Z [W1204 10:32:55.285294060 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1373121Z 2025-12-04T10:49:11.1373269Z [W1204 10:32:55.285371579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1373272Z 2025-12-04T10:49:11.1373310Z FAILED [0.4700s] [100%] 2025-12-04T10:49:11.1373312Z 2025-12-04T10:49:11.1373363Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1373515Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1373572Z Traceback (most recent call last): 2025-12-04T10:49:11.1373728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1373784Z method(*args, **kwargs) 2025-12-04T10:49:11.1373934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1373974Z method(*args, **kwargs) 2025-12-04T10:49:11.1374124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1374162Z with policy(): 2025-12-04T10:49:11.1374314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1374354Z raise RuntimeError(msg) 2025-12-04T10:49:11.1374751Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1374755Z 2025-12-04T10:49:11.1374828Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1375113Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1375116Z 2025-12-04T10:49:11.1375202Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1375274Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1375329Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1375520Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1375592Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1375630Z graph_break [] 2025-12-04T10:49:11.1375700Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1376054Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1376096Z if out == self.unknown_value: 2025-12-04T10:49:11.1376246Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1376290Z Traceback (most recent call last): 2025-12-04T10:49:11.1376444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1376483Z method(*args, **kwargs) 2025-12-04T10:49:11.1376633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1376673Z method(*args, **kwargs) 2025-12-04T10:49:11.1376822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1376858Z with policy(): 2025-12-04T10:49:11.1377011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1377050Z raise RuntimeError(msg) 2025-12-04T10:49:11.1377456Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1377470Z 2025-12-04T10:49:11.1377542Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1377838Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1377840Z 2025-12-04T10:49:11.1377926Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1377997Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1378053Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1378227Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1378299Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1378334Z graph_break [] 2025-12-04T10:49:11.1378405Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1378745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1378789Z if out == self.unknown_value: 2025-12-04T10:49:11.1378861Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1378915Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1378985Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1379177Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1379215Z graph_break [] 2025-12-04T10:49:11.1379265Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1379414Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1379458Z Traceback (most recent call last): 2025-12-04T10:49:11.1379610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1379648Z method(*args, **kwargs) 2025-12-04T10:49:11.1379927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1379969Z method(*args, **kwargs) 2025-12-04T10:49:11.1380120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1380158Z with policy(): 2025-12-04T10:49:11.1380308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1380350Z raise RuntimeError(msg) 2025-12-04T10:49:11.1380754Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1380758Z 2025-12-04T10:49:11.1380830Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1381116Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1381129Z 2025-12-04T10:49:11.1381214Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1381285Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1381351Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1381526Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1381599Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1381635Z graph_break [] 2025-12-04T10:49:11.1381705Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1382089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1382133Z if out == self.unknown_value: 2025-12-04T10:49:11.1382205Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1382260Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1382331Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1382504Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1382541Z graph_break [] 2025-12-04T10:49:11.1382613Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1382665Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1382736Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1382922Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1382960Z graph_break [] 2025-12-04T10:49:11.1383200Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-a41dfcab59f95213.xml - 2025-12-04T10:49:11.1383261Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1383902Z FAILED [0.4700s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1383907Z 2025-12-04T10:49:11.1383979Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1384263Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1384266Z 2025-12-04T10:49:11.1384350Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1384412Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1384478Z ================== 1 failed, 57 deselected, 2 rerun in 11.35s ================== 2025-12-04T10:49:11.1384514Z Got exit code 1 2025-12-04T10:49:11.1384752Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1384893Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1385088Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9f7f31c874ac7fb8.xml 2025-12-04T10:49:11.1385157Z ============================= test session starts ============================== 2025-12-04T10:49:11.1385267Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1385309Z cachedir: .pytest_cache 2025-12-04T10:49:11.1385468Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1385514Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1385556Z configfile: pytest.ini 2025-12-04T10:49:11.1385717Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1385791Z collecting ... collected 58 items / 38 deselected / 20 selected 2025-12-04T10:49:11.1385843Z stepcurrent: skipping 38 already run items. 2025-12-04T10:49:11.1385886Z Running 20 items in this shard 2025-12-04T10:49:11.1385893Z 2025-12-04T10:49:11.1386138Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6528s] [ 5%] 2025-12-04T10:49:11.1386380Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6326s] [ 5%] 2025-12-04T10:49:11.1386601Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.6220s] [ 5%] 2025-12-04T10:49:11.1386603Z 2025-12-04T10:49:11.1386667Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1386814Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1386860Z Traceback (most recent call last): 2025-12-04T10:49:11.1387016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1387056Z method(*args, **kwargs) 2025-12-04T10:49:11.1387207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1387246Z method(*args, **kwargs) 2025-12-04T10:49:11.1387405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1387442Z with policy(): 2025-12-04T10:49:11.1387594Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1387636Z raise RuntimeError(msg) 2025-12-04T10:49:11.1388028Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1388031Z 2025-12-04T10:49:11.1388102Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1388390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1388392Z 2025-12-04T10:49:11.1388476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1388548Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1388616Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1388795Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1388875Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1388913Z graph_break [] 2025-12-04T10:49:11.1389058Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1389104Z Traceback (most recent call last): 2025-12-04T10:49:11.1389258Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1389297Z method(*args, **kwargs) 
2025-12-04T10:49:11.1389447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1389487Z method(*args, **kwargs) 2025-12-04T10:49:11.1389635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1389672Z with policy(): 2025-12-04T10:49:11.1389823Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1389862Z raise RuntimeError(msg) 2025-12-04T10:49:11.1390259Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1390261Z 2025-12-04T10:49:11.1390332Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1390625Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1390629Z 2025-12-04T10:49:11.1390714Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1390785Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1390841Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1391027Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1391099Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1391135Z 
graph_break [] 2025-12-04T10:49:11.1391207Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1391261Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1391331Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1391504Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1391541Z graph_break [] 2025-12-04T10:49:11.1391591Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1391738Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1391782Z Traceback (most recent call last): 2025-12-04T10:49:11.1391985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1392023Z method(*args, **kwargs) 2025-12-04T10:49:11.1392174Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1392228Z method(*args, **kwargs) 2025-12-04T10:49:11.1392377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1392434Z with policy(): 2025-12-04T10:49:11.1392585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1392625Z raise RuntimeError(msg) 2025-12-04T10:49:11.1393024Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. 
CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1393026Z 2025-12-04T10:49:11.1393098Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1393386Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1393389Z 2025-12-04T10:49:11.1393478Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1393549Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1393604Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1393779Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1393851Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1393886Z graph_break [] 2025-12-04T10:49:11.1393957Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1394023Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1394094Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1394265Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1394303Z graph_break [] 2025-12-04T10:49:11.1394373Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1394428Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1394497Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1394682Z inductor [('pattern_matcher_nodes', 
8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1394719Z graph_break [] 2025-12-04T10:49:11.1394960Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9f7f31c874ac7fb8.xml - 2025-12-04T10:49:11.1395019Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1395641Z FAILED [0.6220s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1395643Z 2025-12-04T10:49:11.1395714Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1395998Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1396013Z 2025-12-04T10:49:11.1396097Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1396168Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1396233Z ================== 1 failed, 38 deselected, 2 rerun in 4.05s =================== 2025-12-04T10:49:11.1396270Z Got exit code 1 2025-12-04T10:49:11.1396310Z Retrying single test... 
2025-12-04T10:49:11.1396509Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8acbee9ccdfbb183.xml 2025-12-04T10:49:11.1396565Z ============================= test session starts ============================== 2025-12-04T10:49:11.1396677Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1396719Z cachedir: .pytest_cache 2025-12-04T10:49:11.1396875Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1396920Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1396961Z configfile: pytest.ini 2025-12-04T10:49:11.1397121Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1397193Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1397477Z stepcurrent: skipping 38 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1397521Z Running 1 items in this shard 2025-12-04T10:49:11.1397523Z 2025-12-04T10:49:11.1397891Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:16.492211234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1397895Z 2025-12-04T10:49:11.1398048Z [W1204 10:33:24.024675707 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1398051Z 2025-12-04T10:49:11.1398201Z [W1204 10:33:24.024849104 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398203Z 2025-12-04T10:49:11.1398363Z [W1204 10:33:24.028142184 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398365Z 2025-12-04T10:49:11.1398514Z [W1204 10:33:24.028474548 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398518Z 2025-12-04T10:49:11.1398665Z [W1204 10:33:24.028558746 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398667Z 2025-12-04T10:49:11.1398813Z [W1204 10:33:24.031186668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398815Z 2025-12-04T10:49:11.1398963Z [W1204 10:33:24.031469763 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1398964Z 2025-12-04T10:49:11.1399111Z [W1204 10:33:24.031550382 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1399113Z 2025-12-04T10:49:11.1399164Z ('RERUN', {'yellow': True}) [10.3757s] [100%] 2025-12-04T10:49:11.1399522Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:25.222576466 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1399537Z 2025-12-04T10:49:11.1399702Z [W1204 10:33:25.223096906 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1399703Z 2025-12-04T10:49:11.1399850Z [W1204 10:33:25.223212684 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1399853Z 2025-12-04T10:49:11.1399999Z [W1204 10:33:25.224678717 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1400001Z 2025-12-04T10:49:11.1400148Z [W1204 10:33:25.225071040 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1400151Z 2025-12-04T10:49:11.1400298Z [W1204 10:33:25.225160629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1400301Z 2025-12-04T10:49:11.1400447Z [W1204 10:33:25.227464666 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1400450Z 2025-12-04T10:49:11.1400596Z [W1204 10:33:25.227753271 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1400598Z 2025-12-04T10:49:11.1400746Z [W1204 10:33:25.227833170 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1400748Z 2025-12-04T10:49:11.1400796Z ('RERUN', {'yellow': True}) [0.6446s] [100%] 2025-12-04T10:49:11.1401155Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:26.867109277 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401158Z 2025-12-04T10:49:11.1401308Z [W1204 10:33:26.867531969 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401309Z 2025-12-04T10:49:11.1401456Z [W1204 10:33:26.867626018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401457Z 2025-12-04T10:49:11.1401615Z [W1204 10:33:26.869019082 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401617Z 2025-12-04T10:49:11.1401767Z [W1204 10:33:26.869355356 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401768Z 2025-12-04T10:49:11.1401961Z [W1204 10:33:26.869437355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1401963Z 2025-12-04T10:49:11.1402109Z [W1204 10:33:26.871668694 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1402113Z 2025-12-04T10:49:11.1402260Z [W1204 10:33:26.871936679 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1402261Z 2025-12-04T10:49:11.1402408Z [W1204 10:33:26.872019857 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1402410Z 2025-12-04T10:49:11.1402448Z FAILED [0.6435s] [100%] 2025-12-04T10:49:11.1402450Z 2025-12-04T10:49:11.1402501Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1402648Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1402708Z Traceback (most recent call last): 2025-12-04T10:49:11.1402865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1402920Z method(*args, **kwargs) 2025-12-04T10:49:11.1403073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1403111Z method(*args, **kwargs) 2025-12-04T10:49:11.1403261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1403298Z with policy(): 2025-12-04T10:49:11.1403450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1403490Z raise RuntimeError(msg) 2025-12-04T10:49:11.1403885Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1403890Z 2025-12-04T10:49:11.1403963Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1404248Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1404250Z 2025-12-04T10:49:11.1404337Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1404408Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1404475Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1404651Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1404725Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1404761Z graph_break [] 2025-12-04T10:49:11.1404832Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1405190Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1405235Z if out == self.unknown_value: 2025-12-04T10:49:11.1405384Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1405428Z Traceback (most recent call last): 2025-12-04T10:49:11.1405582Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1405620Z method(*args, **kwargs) 2025-12-04T10:49:11.1405771Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1405810Z method(*args, **kwargs) 2025-12-04T10:49:11.1405960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1405996Z with policy(): 2025-12-04T10:49:11.1406149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1406189Z raise RuntimeError(msg) 2025-12-04T10:49:11.1406586Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1406608Z 2025-12-04T10:49:11.1406689Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1406972Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1406974Z 2025-12-04T10:49:11.1407059Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1407131Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1407186Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1407360Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1407432Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1407468Z graph_break [] 2025-12-04T10:49:11.1407538Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1407880Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1407924Z if out == self.unknown_value: 2025-12-04T10:49:11.1407998Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1408053Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1408126Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1408314Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1408352Z graph_break [] 2025-12-04T10:49:11.1408404Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1408551Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1408596Z Traceback (most recent call last): 2025-12-04T10:49:11.1408749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1408789Z method(*args, **kwargs) 2025-12-04T10:49:11.1408949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1408988Z method(*args, **kwargs) 2025-12-04T10:49:11.1409138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1409175Z with policy(): 2025-12-04T10:49:11.1409326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1409367Z raise RuntimeError(msg) 2025-12-04T10:49:11.1409763Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1409766Z 2025-12-04T10:49:11.1409838Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1410122Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1410139Z 2025-12-04T10:49:11.1410223Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1410295Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1410362Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1410538Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1410609Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1410645Z graph_break [] 2025-12-04T10:49:11.1410716Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1411055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1411099Z if out == self.unknown_value: 2025-12-04T10:49:11.1411168Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1411223Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1411292Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1411466Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1411501Z graph_break [] 2025-12-04T10:49:11.1411572Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1411625Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1411694Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1411913Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1411951Z graph_break [] 2025-12-04T10:49:11.1412193Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8acbee9ccdfbb183.xml - 2025-12-04T10:49:11.1412253Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1412890Z FAILED [0.6435s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1412894Z 2025-12-04T10:49:11.1412965Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1413249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1413252Z 2025-12-04T10:49:11.1413335Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1413396Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1413461Z ================== 1 failed, 57 deselected, 2 rerun in 11.84s ================== 2025-12-04T10:49:11.1413498Z Got exit code 1 2025-12-04T10:49:11.1413537Z Retrying single test... 2025-12-04T10:49:11.1413734Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8f3943e80bcff383.xml 2025-12-04T10:49:11.1413804Z ============================= test session starts ============================== 2025-12-04T10:49:11.1413914Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1413970Z cachedir: .pytest_cache 2025-12-04T10:49:11.1414127Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1414172Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1414211Z configfile: pytest.ini 2025-12-04T10:49:11.1414373Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1414445Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1414733Z stepcurrent: skipping 38 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1414777Z Running 1 items in this shard 2025-12-04T10:49:11.1414779Z 2025-12-04T10:49:11.1415134Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:36.738037965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415138Z 2025-12-04T10:49:11.1415287Z [W1204 10:33:43.209986020 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415291Z 2025-12-04T10:49:11.1415440Z [W1204 10:33:43.210161617 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415442Z 2025-12-04T10:49:11.1415603Z [W1204 10:33:43.213348499 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415606Z 2025-12-04T10:49:11.1415754Z [W1204 10:33:43.213686163 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415758Z 2025-12-04T10:49:11.1415904Z [W1204 10:33:43.213766361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1415906Z 2025-12-04T10:49:11.1416052Z [W1204 10:33:43.216260206 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1416055Z 2025-12-04T10:49:11.1416212Z [W1204 10:33:43.216542300 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1416214Z 2025-12-04T10:49:11.1416364Z [W1204 10:33:43.216620969 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1416367Z 2025-12-04T10:49:11.1416417Z ('RERUN', {'yellow': True}) [10.1827s] [100%] 2025-12-04T10:49:11.1416769Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:44.244336744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1416773Z 2025-12-04T10:49:11.1416919Z [W1204 10:33:44.244738356 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1416921Z 2025-12-04T10:49:11.1417069Z [W1204 10:33:44.244831095 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417072Z 2025-12-04T10:49:11.1417220Z [W1204 10:33:44.246209160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417231Z 2025-12-04T10:49:11.1417378Z [W1204 10:33:44.246553163 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417389Z 2025-12-04T10:49:11.1417537Z [W1204 10:33:44.246633302 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417538Z 2025-12-04T10:49:11.1417684Z [W1204 10:33:44.248824782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1417686Z 2025-12-04T10:49:11.1417834Z [W1204 10:33:44.249088617 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417835Z 2025-12-04T10:49:11.1417982Z [W1204 10:33:44.249166736 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1417986Z 2025-12-04T10:49:11.1418034Z ('RERUN', {'yellow': True}) [0.4449s] [100%] 2025-12-04T10:49:11.1418387Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:33:45.681535807 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1418390Z 2025-12-04T10:49:11.1418538Z [W1204 10:33:45.681927700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1418540Z 2025-12-04T10:49:11.1418688Z [W1204 10:33:45.682035318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1418690Z 2025-12-04T10:49:11.1418839Z [W1204 10:33:45.683407393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1418852Z 2025-12-04T10:49:11.1419000Z [W1204 10:33:45.683744966 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1419001Z 2025-12-04T10:49:11.1419150Z [W1204 10:33:45.683825355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1419152Z 2025-12-04T10:49:11.1419298Z [W1204 10:33:45.686030345 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1419300Z 2025-12-04T10:49:11.1419464Z [W1204 10:33:45.686295390 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1419466Z 2025-12-04T10:49:11.1419612Z [W1204 10:33:45.686372018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1419615Z 2025-12-04T10:49:11.1419655Z FAILED [0.4536s] [100%] 2025-12-04T10:49:11.1419657Z 2025-12-04T10:49:11.1419708Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1419855Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1419901Z Traceback (most recent call last): 2025-12-04T10:49:11.1420057Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1420097Z method(*args, **kwargs) 2025-12-04T10:49:11.1420249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1420289Z method(*args, **kwargs) 2025-12-04T10:49:11.1420438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1420475Z with policy(): 2025-12-04T10:49:11.1420640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1420680Z raise RuntimeError(msg) 2025-12-04T10:49:11.1421082Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1421085Z 2025-12-04T10:49:11.1421159Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1421446Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1421448Z 2025-12-04T10:49:11.1421534Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1421606Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1421660Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1421836Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1421946Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1421984Z graph_break [] 2025-12-04T10:49:11.1422055Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1422397Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1422456Z if out == self.unknown_value: 2025-12-04T10:49:11.1422603Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1422647Z Traceback (most recent call last): 2025-12-04T10:49:11.1422801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1422840Z method(*args, **kwargs) 2025-12-04T10:49:11.1422990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1423030Z method(*args, **kwargs) 2025-12-04T10:49:11.1423193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1423231Z with policy(): 2025-12-04T10:49:11.1423382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1423425Z raise RuntimeError(msg) 2025-12-04T10:49:11.1423820Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1423823Z 2025-12-04T10:49:11.1423895Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1424177Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1424179Z 2025-12-04T10:49:11.1424265Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1424336Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1424406Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1424579Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1424666Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1424702Z graph_break [] 2025-12-04T10:49:11.1424773Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1425114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1425156Z if out == self.unknown_value: 2025-12-04T10:49:11.1425228Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1425283Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1425354Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1425529Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1425567Z graph_break [] 2025-12-04T10:49:11.1425618Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1425765Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1425810Z Traceback (most recent call last): 2025-12-04T10:49:11.1425962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1426001Z method(*args, **kwargs) 2025-12-04T10:49:11.1426162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1426202Z method(*args, **kwargs) 2025-12-04T10:49:11.1426352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1426389Z with policy(): 2025-12-04T10:49:11.1426540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1426579Z raise RuntimeError(msg) 2025-12-04T10:49:11.1426988Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1426991Z 2025-12-04T10:49:11.1427063Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1427346Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1427349Z 2025-12-04T10:49:11.1427435Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1427505Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1427560Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1427738Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1427810Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1427846Z graph_break [] 2025-12-04T10:49:11.1427916Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1428266Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1428320Z if out == self.unknown_value: 2025-12-04T10:49:11.1428391Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1428444Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1428514Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1428689Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1428726Z graph_break [] 2025-12-04T10:49:11.1428796Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1428851Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1428919Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1429091Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1429127Z graph_break [] 2025-12-04T10:49:11.1429367Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8f3943e80bcff383.xml - 2025-12-04T10:49:11.1429425Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1430059Z FAILED [0.4536s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1430063Z 2025-12-04T10:49:11.1430135Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1430416Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1430418Z 2025-12-04T10:49:11.1430513Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1430574Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1430640Z ================== 1 failed, 57 deselected, 2 rerun in 11.25s ================== 2025-12-04T10:49:11.1430676Z Got exit code 1 2025-12-04T10:49:11.1430911Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1431039Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1431236Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ddd7b9321727b9d3.xml 2025-12-04T10:49:11.1431291Z ============================= test session starts ============================== 2025-12-04T10:49:11.1431403Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1431444Z cachedir: .pytest_cache 2025-12-04T10:49:11.1431602Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1431649Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1431720Z configfile: pytest.ini 2025-12-04T10:49:11.1431926Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1432013Z collecting ... collected 58 items / 39 deselected / 19 selected 2025-12-04T10:49:11.1432065Z stepcurrent: skipping 39 already run items. 2025-12-04T10:49:11.1432108Z Running 19 items in this shard 2025-12-04T10:49:11.1432110Z 2025-12-04T10:49:11.1432356Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0780s] [ 5%] 2025-12-04T10:49:11.1432595Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4777s] [ 5%] 2025-12-04T10:49:11.1432814Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.4707s] [ 5%] 2025-12-04T10:49:11.1432818Z 2025-12-04T10:49:11.1432868Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1433019Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1433063Z Traceback (most recent call last): 2025-12-04T10:49:11.1433219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1433259Z method(*args, **kwargs) 2025-12-04T10:49:11.1433411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1433449Z method(*args, **kwargs) 2025-12-04T10:49:11.1433616Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1433656Z with policy(): 2025-12-04T10:49:11.1433806Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1433848Z raise RuntimeError(msg) 2025-12-04T10:49:11.1434243Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1434245Z 2025-12-04T10:49:11.1434332Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1434617Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1434620Z 2025-12-04T10:49:11.1434706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1434776Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1434832Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1435103Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1435176Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1435213Z graph_break [] 2025-12-04T10:49:11.1435360Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1435404Z Traceback (most recent call last): 2025-12-04T10:49:11.1435558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1435609Z method(*args, **kwargs) 2025-12-04T10:49:11.1435758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1435808Z method(*args, **kwargs) 2025-12-04T10:49:11.1435957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1435994Z with policy(): 2025-12-04T10:49:11.1436146Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1436187Z raise RuntimeError(msg) 2025-12-04T10:49:11.1436585Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1436588Z 2025-12-04T10:49:11.1436661Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1436947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1436949Z 2025-12-04T10:49:11.1437033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1437105Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1437159Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1437671Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1437743Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1437780Z graph_break [] 2025-12-04T10:49:11.1437851Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1437905Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1437974Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1438254Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1438290Z graph_break [] 2025-12-04T10:49:11.1438342Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1438489Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1438535Z Traceback (most recent call last): 2025-12-04T10:49:11.1438688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1438728Z method(*args, **kwargs) 2025-12-04T10:49:11.1438880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1438919Z method(*args, **kwargs) 2025-12-04T10:49:11.1439069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1439107Z with policy(): 2025-12-04T10:49:11.1439258Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1439298Z raise RuntimeError(msg) 2025-12-04T10:49:11.1439695Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1439716Z 2025-12-04T10:49:11.1439787Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1440071Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1440073Z 2025-12-04T10:49:11.1440157Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1440228Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1440284Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1440555Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1440627Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1440662Z graph_break [] 2025-12-04T10:49:11.1440733Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1440786Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1440857Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1441127Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1441173Z graph_break [] 2025-12-04T10:49:11.1441243Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1441297Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1441367Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1441638Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1441674Z graph_break [] 2025-12-04T10:49:11.1441975Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ddd7b9321727b9d3.xml - 2025-12-04T10:49:11.1442033Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1442653Z FAILED [0.4707s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1442657Z 2025-12-04T10:49:11.1442728Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1443011Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1443013Z 2025-12-04T10:49:11.1443096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1443159Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1443239Z ================== 1 failed, 39 deselected, 2 rerun in 4.19s =================== 2025-12-04T10:49:11.1443275Z Got exit code 1 2025-12-04T10:49:11.1443330Z Retrying single test... 2025-12-04T10:49:11.1443524Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1d1ac4ab92009038.xml 2025-12-04T10:49:11.1443581Z ============================= test session starts ============================== 2025-12-04T10:49:11.1443691Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1443732Z cachedir: .pytest_cache 2025-12-04T10:49:11.1443890Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1443935Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1443975Z configfile: pytest.ini 2025-12-04T10:49:11.1444137Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1444210Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1444492Z stepcurrent: skipping 39 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1444536Z Running 1 items in this shard 2025-12-04T10:49:11.1444538Z 2025-12-04T10:49:11.1444894Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:06.919885198 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1444896Z 2025-12-04T10:49:11.1445060Z [W1204 10:34:13.325436854 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445064Z 2025-12-04T10:49:11.1445215Z [W1204 10:34:13.325578062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445218Z 2025-12-04T10:49:11.1445366Z [W1204 10:34:13.330114899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445368Z 2025-12-04T10:49:11.1445516Z [W1204 10:34:13.330405414 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445532Z 2025-12-04T10:49:11.1445679Z [W1204 10:34:13.330483422 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445681Z 2025-12-04T10:49:11.1445828Z [W1204 10:34:13.333024206 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445831Z 2025-12-04T10:49:11.1445978Z [W1204 10:34:13.333299571 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1445981Z 2025-12-04T10:49:11.1446127Z [W1204 10:34:13.333375689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1446129Z 2025-12-04T10:49:11.1446179Z ('RERUN', {'yellow': True}) [10.3914s] [100%] 2025-12-04T10:49:11.1446531Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:14.946148410 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1446533Z 2025-12-04T10:49:11.1446683Z [W1204 10:34:14.946539553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1446698Z 2025-12-04T10:49:11.1446845Z [W1204 10:34:14.946634201 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1446865Z 2025-12-04T10:49:11.1447011Z [W1204 10:34:14.948038486 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1447013Z 2025-12-04T10:49:11.1447160Z [W1204 10:34:14.948302231 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1447162Z 2025-12-04T10:49:11.1447309Z [W1204 10:34:14.948379569 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1447311Z 2025-12-04T10:49:11.1447462Z [W1204 10:34:14.950581709 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1447465Z 2025-12-04T10:49:11.1447611Z [W1204 10:34:14.950842334 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1447615Z 2025-12-04T10:49:11.1447761Z [W1204 10:34:14.950917933 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1447763Z 2025-12-04T10:49:11.1447812Z ('RERUN', {'yellow': True}) [0.4754s] [100%] 2025-12-04T10:49:11.1448161Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:14.394483923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448164Z 2025-12-04T10:49:11.1448321Z [W1204 10:34:14.394853127 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448323Z 2025-12-04T10:49:11.1448470Z [W1204 10:34:14.394938415 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448473Z 2025-12-04T10:49:11.1448619Z [W1204 10:34:14.396462417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448621Z 2025-12-04T10:49:11.1448769Z [W1204 10:34:14.396720523 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448771Z 2025-12-04T10:49:11.1448929Z [W1204 10:34:14.396796441 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1448931Z 2025-12-04T10:49:11.1449079Z [W1204 10:34:14.398968182 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1449081Z 2025-12-04T10:49:11.1449228Z [W1204 10:34:14.399233187 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1449231Z 2025-12-04T10:49:11.1449379Z [W1204 10:34:14.399312485 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1449381Z 2025-12-04T10:49:11.1449420Z FAILED [0.4396s] [100%] 2025-12-04T10:49:11.1449422Z 2025-12-04T10:49:11.1449473Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1449621Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1449666Z Traceback (most recent call last): 2025-12-04T10:49:11.1449821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1449873Z method(*args, **kwargs) 2025-12-04T10:49:11.1450026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1450065Z method(*args, **kwargs) 2025-12-04T10:49:11.1450226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1450263Z with policy(): 2025-12-04T10:49:11.1450414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1450454Z raise RuntimeError(msg) 2025-12-04T10:49:11.1450848Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1450852Z 2025-12-04T10:49:11.1450925Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1451212Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1451215Z 2025-12-04T10:49:11.1451302Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1451373Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1451428Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1451700Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1451782Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1451819Z graph_break [] 2025-12-04T10:49:11.1451926Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1452269Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1452313Z if out == self.unknown_value: 2025-12-04T10:49:11.1452459Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1452516Z Traceback (most recent call last): 2025-12-04T10:49:11.1452670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1452710Z method(*args, **kwargs) 2025-12-04T10:49:11.1452862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1452904Z method(*args, **kwargs) 2025-12-04T10:49:11.1453053Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1453090Z with policy(): 2025-12-04T10:49:11.1453240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1453280Z raise RuntimeError(msg) 2025-12-04T10:49:11.1453674Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1453677Z 2025-12-04T10:49:11.1453749Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1454049Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1454064Z 2025-12-04T10:49:11.1454149Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1454220Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1454274Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1454543Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1454615Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1454652Z graph_break [] 2025-12-04T10:49:11.1454724Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1455063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1455107Z if out == self.unknown_value: 2025-12-04T10:49:11.1455177Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1455231Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1455302Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1455585Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1455622Z graph_break [] 2025-12-04T10:49:11.1455673Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1455818Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1455864Z Traceback (most recent call last): 2025-12-04T10:49:11.1456019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1456058Z method(*args, **kwargs) 2025-12-04T10:49:11.1456217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1456258Z method(*args, **kwargs) 2025-12-04T10:49:11.1456407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1456444Z with policy(): 2025-12-04T10:49:11.1456597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1456636Z raise RuntimeError(msg) 2025-12-04T10:49:11.1457032Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1457034Z 2025-12-04T10:49:11.1457105Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1457390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1457392Z 2025-12-04T10:49:11.1457477Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1457558Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1457612Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1457892Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1457963Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1457999Z graph_break [] 2025-12-04T10:49:11.1458069Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1458413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1458458Z if out == self.unknown_value: 2025-12-04T10:49:11.1458527Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1458582Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1458651Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1458920Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1458955Z graph_break [] 2025-12-04T10:49:11.1459027Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1459079Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1459149Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1459425Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1459464Z graph_break [] 2025-12-04T10:49:11.1459703Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-1d1ac4ab92009038.xml - 2025-12-04T10:49:11.1459764Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1460397Z FAILED [0.4396s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1460400Z 2025-12-04T10:49:11.1460471Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1460756Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1460758Z 2025-12-04T10:49:11.1460841Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1460903Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1460969Z ================== 1 failed, 57 deselected, 2 rerun in 11.47s ================== 2025-12-04T10:49:11.1461005Z Got exit code 1 2025-12-04T10:49:11.1461046Z Retrying single test... 2025-12-04T10:49:11.1461241Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ccb56d5a295b1203.xml 2025-12-04T10:49:11.1461313Z ============================= test session starts ============================== 2025-12-04T10:49:11.1461435Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1461476Z cachedir: .pytest_cache 2025-12-04T10:49:11.1461634Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1461680Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1461719Z configfile: pytest.ini 2025-12-04T10:49:11.1461909Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1461982Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1462266Z stepcurrent: skipping 39 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1462310Z Running 1 items in this shard 2025-12-04T10:49:11.1462313Z 2025-12-04T10:49:11.1462669Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:25.552915208 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1462671Z 2025-12-04T10:49:11.1462824Z [W1204 10:34:32.009609910 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1462826Z 2025-12-04T10:49:11.1462975Z [W1204 10:34:32.009818716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1462977Z 2025-12-04T10:49:11.1463138Z [W1204 10:34:32.013315972 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1463140Z 2025-12-04T10:49:11.1463287Z [W1204 10:34:32.013680805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1463290Z 2025-12-04T10:49:11.1463436Z [W1204 10:34:32.013760784 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1463438Z 2025-12-04T10:49:11.1463595Z [W1204 10:34:32.016535633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1463597Z 2025-12-04T10:49:11.1463743Z [W1204 10:34:32.016841407 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1463745Z 2025-12-04T10:49:11.1463892Z [W1204 10:34:32.016917216 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1463895Z 2025-12-04T10:49:11.1463944Z ('RERUN', {'yellow': True}) [10.4517s] [100%] 2025-12-04T10:49:11.1464300Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:33.646187599 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1464302Z 2025-12-04T10:49:11.1464450Z [W1204 10:34:33.646572702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1464453Z 2025-12-04T10:49:11.1464600Z [W1204 10:34:33.646658200 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1464602Z 2025-12-04T10:49:11.1464764Z [W1204 10:34:33.648069175 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1464766Z 2025-12-04T10:49:11.1464913Z [W1204 10:34:33.648326150 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1464929Z 2025-12-04T10:49:11.1465077Z [W1204 10:34:33.648400418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1465079Z 2025-12-04T10:49:11.1465225Z [W1204 10:34:33.650693327 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1465227Z 2025-12-04T10:49:11.1465374Z [W1204 10:34:33.650948252 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1465376Z 2025-12-04T10:49:11.1465523Z [W1204 10:34:33.651027531 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1465526Z 2025-12-04T10:49:11.1465574Z ('RERUN', {'yellow': True}) [0.5070s] [100%] 2025-12-04T10:49:11.1465925Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:34:33.158757592 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1465927Z 2025-12-04T10:49:11.1466075Z [W1204 10:34:33.159152965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466077Z 2025-12-04T10:49:11.1466224Z [W1204 10:34:33.159247873 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466226Z 2025-12-04T10:49:11.1466383Z [W1204 10:34:33.160652298 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466387Z 2025-12-04T10:49:11.1466533Z [W1204 10:34:33.160934553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1466536Z 2025-12-04T10:49:11.1466684Z [W1204 10:34:33.161020771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466686Z 2025-12-04T10:49:11.1466842Z [W1204 10:34:33.163344598 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466844Z 2025-12-04T10:49:11.1466996Z [W1204 10:34:33.163608334 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1466998Z 2025-12-04T10:49:11.1467147Z [W1204 10:34:33.163682902 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1467150Z 2025-12-04T10:49:11.1467187Z FAILED [0.4807s] [100%] 2025-12-04T10:49:11.1467189Z 2025-12-04T10:49:11.1467240Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1467387Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1467432Z Traceback (most recent call last): 2025-12-04T10:49:11.1467587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1467628Z method(*args, **kwargs) 2025-12-04T10:49:11.1467779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1467819Z method(*args, **kwargs) 2025-12-04T10:49:11.1467970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1468017Z with policy(): 2025-12-04T10:49:11.1468167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1468217Z raise RuntimeError(msg) 2025-12-04T10:49:11.1468606Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1468610Z 2025-12-04T10:49:11.1468682Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1468969Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1468973Z 2025-12-04T10:49:11.1469057Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1469129Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1469184Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1469455Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1469527Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1469564Z graph_break [] 2025-12-04T10:49:11.1469633Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1469983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1470027Z if out == self.unknown_value: 2025-12-04T10:49:11.1470174Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1470218Z Traceback (most recent call last): 2025-12-04T10:49:11.1470369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1470409Z method(*args, **kwargs) 2025-12-04T10:49:11.1470567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1470607Z method(*args, **kwargs) 2025-12-04T10:49:11.1470756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1470794Z with policy(): 2025-12-04T10:49:11.1470944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1470985Z raise RuntimeError(msg) 2025-12-04T10:49:11.1471386Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1471389Z 2025-12-04T10:49:11.1471462Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1471746Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1471758Z 2025-12-04T10:49:11.1471881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1471953Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1472022Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1472292Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1472362Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1472400Z graph_break [] 2025-12-04T10:49:11.1472469Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1472810Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1472853Z if out == self.unknown_value: 2025-12-04T10:49:11.1472923Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1472978Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1473048Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1473316Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1473353Z graph_break [] 2025-12-04T10:49:11.1473403Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1473551Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1473613Z Traceback (most recent call last): 2025-12-04T10:49:11.1473767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1473807Z method(*args, **kwargs) 2025-12-04T10:49:11.1473958Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1473997Z method(*args, **kwargs) 2025-12-04T10:49:11.1474145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1474182Z with policy(): 2025-12-04T10:49:11.1474354Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1474395Z raise RuntimeError(msg) 2025-12-04T10:49:11.1474791Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1474795Z 2025-12-04T10:49:11.1474868Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1475149Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1475151Z 2025-12-04T10:49:11.1475237Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1475307Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1475362Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1475638Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1475721Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1475767Z graph_break [] 2025-12-04T10:49:11.1475837Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1476177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1476219Z if out == self.unknown_value: 2025-12-04T10:49:11.1476290Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1476343Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1476414Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1476683Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1476721Z graph_break [] 2025-12-04T10:49:11.1476789Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1476843Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1476912Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1477179Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1477215Z graph_break [] 2025-12-04T10:49:11.1477465Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ccb56d5a295b1203.xml - 2025-12-04T10:49:11.1477525Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1478160Z FAILED [0.4807s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1478163Z 2025-12-04T10:49:11.1478236Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1478518Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1478522Z 2025-12-04T10:49:11.1478606Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1478667Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1478732Z ================== 1 failed, 57 deselected, 2 rerun in 11.61s ================== 2025-12-04T10:49:11.1478769Z Got exit code 1 2025-12-04T10:49:11.1479007Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1479135Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1479332Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cc49f7e929dcd2ab.xml 2025-12-04T10:49:11.1479398Z ============================= test session starts ============================== 2025-12-04T10:49:11.1479508Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1479559Z cachedir: .pytest_cache 2025-12-04T10:49:11.1479716Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1479762Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1479801Z configfile: pytest.ini 2025-12-04T10:49:11.1479965Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1480037Z collecting ... collected 58 items / 40 deselected / 18 selected 2025-12-04T10:49:11.1480089Z stepcurrent: skipping 40 already run items. 2025-12-04T10:49:11.1480133Z Running 18 items in this shard 2025-12-04T10:49:11.1480136Z 2025-12-04T10:49:11.1480388Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8873s] [ 5%] 2025-12-04T10:49:11.1480636Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7697s] [ 5%] 2025-12-04T10:49:11.1480858Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.7830s] [ 5%] 2025-12-04T10:49:11.1480860Z 2025-12-04T10:49:11.1480912Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1481070Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1481116Z Traceback (most recent call last): 2025-12-04T10:49:11.1481273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1481315Z method(*args, **kwargs) 2025-12-04T10:49:11.1481465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1481505Z method(*args, **kwargs) 2025-12-04T10:49:11.1481653Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1481691Z with policy(): 2025-12-04T10:49:11.1481904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1481946Z raise RuntimeError(msg) 2025-12-04T10:49:11.1482352Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1482357Z 2025-12-04T10:49:11.1482429Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1482719Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1482721Z 2025-12-04T10:49:11.1482807Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1482878Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1482932Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1483107Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1483190Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1483241Z graph_break [] 2025-12-04T10:49:11.1483389Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1483434Z Traceback (most recent call 
last): 2025-12-04T10:49:11.1483586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1483626Z method(*args, **kwargs) 2025-12-04T10:49:11.1483774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1483814Z method(*args, **kwargs) 2025-12-04T10:49:11.1483963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1484000Z with policy(): 2025-12-04T10:49:11.1484151Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1484192Z raise RuntimeError(msg) 2025-12-04T10:49:11.1484605Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1484608Z 2025-12-04T10:49:11.1484679Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1484978Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1484982Z 2025-12-04T10:49:11.1485066Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1485138Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1485193Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1485368Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1485439Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1485484Z graph_break [] 2025-12-04T10:49:11.1485555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1485608Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1485679Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1485853Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1485889Z graph_break [] 2025-12-04T10:49:11.1485941Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1486093Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1486136Z Traceback (most recent call last): 2025-12-04T10:49:11.1486290Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1486328Z method(*args, **kwargs) 2025-12-04T10:49:11.1486478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1486516Z method(*args, **kwargs) 2025-12-04T10:49:11.1486666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1486713Z with policy(): 2025-12-04T10:49:11.1486864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1486915Z raise RuntimeError(msg) 2025-12-04T10:49:11.1487324Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1487326Z 2025-12-04T10:49:11.1487398Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1487685Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1487688Z 2025-12-04T10:49:11.1487772Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1487843Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1487897Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1488069Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1488140Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1488176Z graph_break [] 2025-12-04T10:49:11.1488246Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1488299Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1488384Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1488559Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1488597Z graph_break [] 2025-12-04T10:49:11.1488667Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1488722Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1488793Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1488973Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1489010Z graph_break [] 2025-12-04T10:49:11.1489253Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cc49f7e929dcd2ab.xml - 2025-12-04T10:49:11.1489313Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1489952Z FAILED [0.7830s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1489956Z 2025-12-04T10:49:11.1490028Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1490314Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1490329Z 2025-12-04T10:49:11.1490412Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1490472Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1490547Z ================== 1 failed, 40 deselected, 2 rerun in 4.59s =================== 2025-12-04T10:49:11.1490583Z Got exit code 1 2025-12-04T10:49:11.1490624Z Retrying single test... 
2025-12-04T10:49:11.1490820Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0f59b358b6890424.xml 2025-12-04T10:49:11.1490878Z ============================= test session starts ============================== 2025-12-04T10:49:11.1490988Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1491028Z cachedir: .pytest_cache 2025-12-04T10:49:11.1491186Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1491231Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1491272Z configfile: pytest.ini 2025-12-04T10:49:11.1491433Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1491506Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1491790Z stepcurrent: skipping 40 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1491835Z Running 1 items in this shard 2025-12-04T10:49:11.1491837Z 2025-12-04T10:49:11.1492266Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:34:54.508539297 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1492270Z 2025-12-04T10:49:11.1492421Z [W1204 10:35:02.036233355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1492424Z 2025-12-04T10:49:11.1492573Z [W1204 10:35:02.036421672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1492578Z 2025-12-04T10:49:11.1492724Z [W1204 10:35:02.040451158 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1492739Z 2025-12-04T10:49:11.1492887Z [W1204 10:35:02.040967409 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1492889Z 2025-12-04T10:49:11.1493036Z [W1204 10:35:02.041070357 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1493040Z 2025-12-04T10:49:11.1493188Z [W1204 10:35:02.043680289 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1493191Z 2025-12-04T10:49:11.1493337Z [W1204 10:35:02.043982734 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1493339Z 2025-12-04T10:49:11.1493484Z [W1204 10:35:02.044065352 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1493486Z 2025-12-04T10:49:11.1493537Z ('RERUN', {'yellow': True}) [10.3017s] [100%] 2025-12-04T10:49:11.1493898Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:35:03.175317201 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1493915Z 2025-12-04T10:49:11.1494064Z [W1204 10:35:03.175701144 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494079Z 2025-12-04T10:49:11.1494226Z [W1204 10:35:03.175785142 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494228Z 2025-12-04T10:49:11.1494374Z [W1204 10:35:03.177164457 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494377Z 2025-12-04T10:49:11.1494523Z [W1204 10:35:03.177493801 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494525Z 2025-12-04T10:49:11.1494672Z [W1204 10:35:03.177572910 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494674Z 2025-12-04T10:49:11.1494822Z [W1204 10:35:03.179752160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494825Z 2025-12-04T10:49:11.1494972Z [W1204 10:35:03.180013295 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1494974Z 2025-12-04T10:49:11.1495122Z [W1204 10:35:03.180090924 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1495124Z 2025-12-04T10:49:11.1495177Z ('RERUN', {'yellow': True}) [0.5949s] [100%] 2025-12-04T10:49:11.1495548Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:35:04.761661972 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1495552Z 2025-12-04T10:49:11.1495699Z [W1204 10:35:04.762047475 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1495702Z 2025-12-04T10:49:11.1495849Z [W1204 10:35:04.762137733 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1495851Z 2025-12-04T10:49:11.1495997Z [W1204 10:35:04.763515178 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1495999Z 2025-12-04T10:49:11.1496155Z [W1204 10:35:04.763838153 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1496156Z 2025-12-04T10:49:11.1496304Z [W1204 10:35:04.763916951 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1496306Z 2025-12-04T10:49:11.1496453Z [W1204 10:35:04.766085902 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1496456Z 2025-12-04T10:49:11.1496603Z [W1204 10:35:04.766338757 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1496608Z 2025-12-04T10:49:11.1496753Z [W1204 10:35:04.766413706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1496755Z 2025-12-04T10:49:11.1496793Z FAILED [0.5882s] [100%] 2025-12-04T10:49:11.1496795Z 2025-12-04T10:49:11.1496846Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1496997Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1497052Z Traceback (most recent call last): 2025-12-04T10:49:11.1497208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1497259Z method(*args, **kwargs) 2025-12-04T10:49:11.1497411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1497449Z method(*args, **kwargs) 2025-12-04T10:49:11.1497600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1497636Z with policy(): 2025-12-04T10:49:11.1497790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1497829Z raise RuntimeError(msg) 2025-12-04T10:49:11.1498234Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1498238Z 2025-12-04T10:49:11.1498311Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1498600Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1498602Z 2025-12-04T10:49:11.1498689Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1498760Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1498817Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1499001Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1499074Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1499110Z graph_break [] 2025-12-04T10:49:11.1499181Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1499528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1499581Z if out == self.unknown_value: 2025-12-04T10:49:11.1499732Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1499776Z Traceback (most recent call last): 2025-12-04T10:49:11.1499929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1499969Z method(*args, **kwargs) 2025-12-04T10:49:11.1500119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1500158Z method(*args, **kwargs) 2025-12-04T10:49:11.1500308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1500343Z with policy(): 2025-12-04T10:49:11.1500494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1500534Z raise RuntimeError(msg) 2025-12-04T10:49:11.1500944Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1500956Z 2025-12-04T10:49:11.1501028Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1501334Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1501336Z 2025-12-04T10:49:11.1501421Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1501493Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1501548Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1501727Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1501800Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1501836Z graph_break [] 2025-12-04T10:49:11.1501943Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1502284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1502328Z if out == self.unknown_value: 2025-12-04T10:49:11.1502397Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1502453Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1502523Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1502710Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1502747Z graph_break [] 2025-12-04T10:49:11.1502798Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1502947Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1502993Z Traceback (most recent call last): 2025-12-04T10:49:11.1503146Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1503185Z method(*args, **kwargs) 2025-12-04T10:49:11.1503349Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1503389Z method(*args, **kwargs) 2025-12-04T10:49:11.1503538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1503577Z with policy(): 2025-12-04T10:49:11.1503729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1503769Z raise RuntimeError(msg) 2025-12-04T10:49:11.1504180Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1504183Z 2025-12-04T10:49:11.1504256Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1504544Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1504558Z 2025-12-04T10:49:11.1504644Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1504715Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1504784Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1504957Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1505028Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1505064Z graph_break [] 2025-12-04T10:49:11.1505135Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1505476Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1505521Z if out == self.unknown_value: 2025-12-04T10:49:11.1505591Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1505644Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1505715Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1505889Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1505925Z graph_break [] 2025-12-04T10:49:11.1505996Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1506049Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1506122Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1508163Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1508205Z graph_break [] 2025-12-04T10:49:11.1508447Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0f59b358b6890424.xml - 2025-12-04T10:49:11.1508507Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1509161Z FAILED [0.5882s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1509164Z 2025-12-04T10:49:11.1509238Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1509527Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1509530Z 2025-12-04T10:49:11.1509616Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1509677Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1509743Z ================== 1 failed, 57 deselected, 2 rerun in 11.64s ================== 2025-12-04T10:49:11.1509783Z Got exit code 1 2025-12-04T10:49:11.1509824Z Retrying single test... 2025-12-04T10:49:11.1510021Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4d855f32174f6cb.xml 2025-12-04T10:49:11.1510082Z ============================= test session starts ============================== 2025-12-04T10:49:11.1510205Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1510246Z cachedir: .pytest_cache 2025-12-04T10:49:11.1510418Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1510464Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1510504Z configfile: pytest.ini 2025-12-04T10:49:11.1510666Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1510743Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1511029Z stepcurrent: skipping 40 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1511075Z Running 1 items in this shard 2025-12-04T10:49:11.1511077Z 2025-12-04T10:49:11.1511441Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:35:13.102833864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1511444Z 2025-12-04T10:49:11.1511597Z [W1204 10:35:21.715254593 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1511599Z 2025-12-04T10:49:11.1511751Z [W1204 10:35:21.715462759 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1511753Z 2025-12-04T10:49:11.1511940Z [W1204 10:35:21.719885408 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1511962Z 2025-12-04T10:49:11.1512109Z [W1204 10:35:21.720335790 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1512111Z 2025-12-04T10:49:11.1512263Z [W1204 10:35:21.720431488 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1512265Z 2025-12-04T10:49:11.1512412Z [W1204 10:35:21.723301096 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1512414Z 2025-12-04T10:49:11.1512572Z [W1204 10:35:21.723607660 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1512575Z 2025-12-04T10:49:11.1512721Z [W1204 10:35:21.723687149 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1512724Z 2025-12-04T10:49:11.1512777Z ('RERUN', {'yellow': True}) [10.4656s] [100%] 2025-12-04T10:49:11.1513135Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:35:22.089616672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513138Z 2025-12-04T10:49:11.1513286Z [W1204 10:35:22.090032235 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513288Z 2025-12-04T10:49:11.1513438Z [W1204 10:35:22.090119813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513440Z 2025-12-04T10:49:11.1513587Z [W1204 10:35:22.091551667 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513602Z 2025-12-04T10:49:11.1513749Z [W1204 10:35:22.091888441 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513751Z 2025-12-04T10:49:11.1513913Z [W1204 10:35:22.091973829 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1513915Z 2025-12-04T10:49:11.1514061Z [W1204 10:35:22.094305747 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1514063Z 2025-12-04T10:49:11.1514209Z [W1204 10:35:22.094567192 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1514212Z 2025-12-04T10:49:11.1514361Z [W1204 10:35:22.094645241 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1514363Z 2025-12-04T10:49:11.1514415Z ('RERUN', {'yellow': True}) [0.8507s] [100%] 2025-12-04T10:49:11.1514773Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:35:23.887739324 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1514776Z 2025-12-04T10:49:11.1514924Z [W1204 10:35:23.888186256 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1514926Z 2025-12-04T10:49:11.1515074Z [W1204 10:35:23.888293044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515076Z 2025-12-04T10:49:11.1515223Z [W1204 10:35:23.889755487 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515225Z 2025-12-04T10:49:11.1515382Z [W1204 10:35:23.890118620 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515384Z 2025-12-04T10:49:11.1515530Z [W1204 10:35:23.890203959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1515533Z 2025-12-04T10:49:11.1515680Z [W1204 10:35:23.892567346 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515682Z 2025-12-04T10:49:11.1515837Z [W1204 10:35:23.892837001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515839Z 2025-12-04T10:49:11.1515986Z [W1204 10:35:23.892925979 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1515988Z 2025-12-04T10:49:11.1516027Z FAILED [0.7648s] [100%] 2025-12-04T10:49:11.1516031Z 2025-12-04T10:49:11.1516082Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1516233Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1516280Z Traceback (most recent call last): 2025-12-04T10:49:11.1516437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1516477Z method(*args, **kwargs) 2025-12-04T10:49:11.1516631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1516670Z method(*args, **kwargs) 2025-12-04T10:49:11.1516820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1516858Z with policy(): 2025-12-04T10:49:11.1517031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1517071Z raise RuntimeError(msg) 2025-12-04T10:49:11.1517472Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1517486Z 2025-12-04T10:49:11.1517559Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1517851Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1517853Z 2025-12-04T10:49:11.1517941Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1518014Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1518069Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1518245Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1518319Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1518355Z graph_break [] 2025-12-04T10:49:11.1518426Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1518771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1518826Z if out == self.unknown_value: 2025-12-04T10:49:11.1518977Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1519022Z Traceback (most recent call last): 2025-12-04T10:49:11.1519176Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1519216Z method(*args, **kwargs) 2025-12-04T10:49:11.1519367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1519406Z method(*args, **kwargs) 2025-12-04T10:49:11.1519569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1519607Z with policy(): 2025-12-04T10:49:11.1519758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1519800Z raise RuntimeError(msg) 2025-12-04T10:49:11.1520210Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1520213Z 2025-12-04T10:49:11.1520284Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1520575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1520577Z 2025-12-04T10:49:11.1520661Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1520734Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1520798Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1520977Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1521061Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1521098Z graph_break [] 2025-12-04T10:49:11.1521167Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1521511Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1521555Z if out == self.unknown_value: 2025-12-04T10:49:11.1521625Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1521681Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1521752Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1521962Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1521999Z graph_break [] 2025-12-04T10:49:11.1522050Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1522200Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1522248Z Traceback (most recent call last): 2025-12-04T10:49:11.1522400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1522440Z method(*args, **kwargs) 2025-12-04T10:49:11.1522604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1522645Z method(*args, **kwargs) 2025-12-04T10:49:11.1522793Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1522831Z with policy(): 2025-12-04T10:49:11.1522982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1523024Z raise RuntimeError(msg) 2025-12-04T10:49:11.1523447Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1523450Z 2025-12-04T10:49:11.1523523Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1523811Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1523814Z 2025-12-04T10:49:11.1523898Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1523969Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1524023Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1524198Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1524268Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1524304Z graph_break [] 2025-12-04T10:49:11.1524375Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1524726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1524785Z if out == self.unknown_value: 2025-12-04T10:49:11.1524856Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1524910Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1524980Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1525154Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1525189Z graph_break [] 2025-12-04T10:49:11.1525260Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1525315Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1525385Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1525555Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1525592Z graph_break [] 2025-12-04T10:49:11.1525833Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d4d855f32174f6cb.xml - 2025-12-04T10:49:11.1525892Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1526539Z FAILED [0.7648s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1526545Z 2025-12-04T10:49:11.1526616Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1526902Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1526904Z 2025-12-04T10:49:11.1526999Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1527060Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1527125Z ================== 1 failed, 57 deselected, 2 rerun in 12.23s ================== 2025-12-04T10:49:11.1527164Z Got exit code 1 2025-12-04T10:49:11.1527404Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1527533Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1527726Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8b37ed6375151a97.xml 2025-12-04T10:49:11.1527784Z ============================= test session starts ============================== 2025-12-04T10:49:11.1527895Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1527937Z cachedir: .pytest_cache 2025-12-04T10:49:11.1528093Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1528153Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1528193Z configfile: pytest.ini 2025-12-04T10:49:11.1528353Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1528439Z collecting ... collected 58 items / 41 deselected / 17 selected 2025-12-04T10:49:11.1528490Z stepcurrent: skipping 41 already run items. 2025-12-04T10:49:11.1528535Z Running 17 items in this shard 2025-12-04T10:49:11.1528537Z 2025-12-04T10:49:11.1528784Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7129s] [ 5%] 2025-12-04T10:49:11.1529028Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5183s] [ 5%] 2025-12-04T10:49:11.1529250Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 FAILED [0.4898s] [ 5%] 2025-12-04T10:49:11.1529252Z 2025-12-04T10:49:11.1529304Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1529451Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1529496Z Traceback (most recent call last): 2025-12-04T10:49:11.1529653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1529694Z method(*args, **kwargs) 2025-12-04T10:49:11.1529845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1529883Z method(*args, **kwargs) 2025-12-04T10:49:11.1530041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1530079Z with policy(): 2025-12-04T10:49:11.1530230Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1530272Z raise RuntimeError(msg) 2025-12-04T10:49:11.1530804Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1530807Z 2025-12-04T10:49:11.1530878Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1531166Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1531169Z 2025-12-04T10:49:11.1531254Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1531327Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1531383Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1531557Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1531629Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1531667Z graph_break [] 2025-12-04T10:49:11.1531814Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1531889Z Traceback (most recent call last): 2025-12-04T10:49:11.1532044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1532098Z method(*args, 
**kwargs) 2025-12-04T10:49:11.1532249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1532302Z method(*args, **kwargs) 2025-12-04T10:49:11.1532452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1532488Z with policy(): 2025-12-04T10:49:11.1532639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1532679Z raise RuntimeError(msg) 2025-12-04T10:49:11.1533083Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1533086Z 2025-12-04T10:49:11.1533156Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1533444Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1533446Z 2025-12-04T10:49:11.1533531Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1533603Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1533657Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1533852Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1533925Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.1533962Z graph_break [] 2025-12-04T10:49:11.1534034Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1534088Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1534158Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1534330Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1534367Z graph_break [] 2025-12-04T10:49:11.1534430Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1534577Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1534621Z Traceback (most recent call last): 2025-12-04T10:49:11.1534776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1534816Z method(*args, **kwargs) 2025-12-04T10:49:11.1534966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1535005Z method(*args, **kwargs) 2025-12-04T10:49:11.1535154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1535190Z with policy(): 2025-12-04T10:49:11.1535341Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1535383Z raise RuntimeError(msg) 2025-12-04T10:49:11.1535786Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1535798Z 2025-12-04T10:49:11.1535870Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1536167Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1536169Z 2025-12-04T10:49:11.1536254Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1536325Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1536380Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1536553Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1536625Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1536660Z graph_break [] 2025-12-04T10:49:11.1536731Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1536786Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1536855Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1537026Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1537063Z graph_break [] 2025-12-04T10:49:11.1537133Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1537186Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1537256Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1537439Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1537475Z graph_break [] 2025-12-04T10:49:11.1537717Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8b37ed6375151a97.xml - 2025-12-04T10:49:11.1537776Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1538420Z FAILED [0.4898s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1538425Z 2025-12-04T10:49:11.1538496Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1538783Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1538786Z 2025-12-04T10:49:11.1538869Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1538930Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1538996Z ================== 1 failed, 41 deselected, 2 rerun in 3.88s =================== 2025-12-04T10:49:11.1539034Z Got exit code 1 2025-12-04T10:49:11.1539073Z Retrying single test... 
2025-12-04T10:49:11.1539270Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-88fbc6393e914740.xml 2025-12-04T10:49:11.1539336Z ============================= test session starts ============================== 2025-12-04T10:49:11.1539448Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1539500Z cachedir: .pytest_cache 2025-12-04T10:49:11.1539657Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1539702Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1539742Z configfile: pytest.ini 2025-12-04T10:49:11.1539903Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1539976Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1540262Z stepcurrent: skipping 41 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1540306Z Running 1 items in this shard 2025-12-04T10:49:11.1540308Z 2025-12-04T10:49:11.1540673Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:35:43.270433552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1540676Z 2025-12-04T10:49:11.1540831Z [W1204 10:35:51.754951031 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1540833Z 2025-12-04T10:49:11.1540984Z [W1204 10:35:51.755124268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1540986Z 2025-12-04T10:49:11.1541142Z [W1204 10:35:51.758827340 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541146Z 2025-12-04T10:49:11.1541292Z [W1204 10:35:51.759256342 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541296Z 2025-12-04T10:49:11.1541442Z [W1204 10:35:51.759343581 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541444Z 2025-12-04T10:49:11.1541599Z [W1204 10:35:51.761927934 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541601Z 2025-12-04T10:49:11.1541748Z [W1204 10:35:51.762223628 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541750Z 2025-12-04T10:49:11.1541926Z [W1204 10:35:51.762304207 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1541929Z 2025-12-04T10:49:11.1541980Z ('RERUN', {'yellow': True}) [10.2178s] [100%] 2025-12-04T10:49:11.1542335Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:35:52.926438920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1542339Z 2025-12-04T10:49:11.1542486Z [W1204 10:35:52.926828063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1542488Z 2025-12-04T10:49:11.1542636Z [W1204 10:35:52.926907351 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1542638Z 2025-12-04T10:49:11.1542786Z [W1204 10:35:52.928879945 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1542801Z 2025-12-04T10:49:11.1542949Z [W1204 10:35:52.929233049 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1542968Z 2025-12-04T10:49:11.1543115Z [W1204 10:35:52.929314118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1543117Z 2025-12-04T10:49:11.1543262Z [W1204 10:35:52.931525077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1543265Z 2025-12-04T10:49:11.1543412Z [W1204 10:35:52.931801762 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1543415Z 2025-12-04T10:49:11.1543561Z [W1204 10:35:52.931879301 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1543564Z 2025-12-04T10:49:11.1543613Z ('RERUN', {'yellow': True}) [0.6461s] [100%] 2025-12-04T10:49:11.1543966Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:35:52.503961579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1543970Z 2025-12-04T10:49:11.1544117Z [W1204 10:35:52.504338923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544120Z 2025-12-04T10:49:11.1544267Z [W1204 10:35:52.504419811 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544268Z 2025-12-04T10:49:11.1544434Z [W1204 10:35:52.505794286 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544437Z 2025-12-04T10:49:11.1544585Z [W1204 10:35:52.506120280 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544588Z 2025-12-04T10:49:11.1544735Z [W1204 10:35:52.506199039 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544738Z 2025-12-04T10:49:11.1544885Z [W1204 10:35:52.508398419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1544887Z 2025-12-04T10:49:11.1545046Z [W1204 10:35:52.508654654 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1545047Z 2025-12-04T10:49:11.1545195Z [W1204 10:35:52.508729513 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1545198Z 2025-12-04T10:49:11.1545237Z FAILED [0.5812s] [100%] 2025-12-04T10:49:11.1545239Z 2025-12-04T10:49:11.1545289Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1545438Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1545483Z Traceback (most recent call last): 2025-12-04T10:49:11.1545638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1545678Z method(*args, **kwargs) 2025-12-04T10:49:11.1545830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1545869Z method(*args, **kwargs) 2025-12-04T10:49:11.1546019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1546067Z with policy(): 2025-12-04T10:49:11.1546218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1546272Z raise RuntimeError(msg) 2025-12-04T10:49:11.1546664Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1546666Z 2025-12-04T10:49:11.1546740Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1547030Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1547033Z 2025-12-04T10:49:11.1547120Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1547191Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1547248Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1547422Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1547495Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1547532Z graph_break [] 2025-12-04T10:49:11.1547603Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1547958Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1548002Z if out == self.unknown_value: 2025-12-04T10:49:11.1548150Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1548195Z Traceback (most recent call last): 2025-12-04T10:49:11.1548346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1548384Z method(*args, **kwargs) 2025-12-04T10:49:11.1548534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1548581Z method(*args, **kwargs) 2025-12-04T10:49:11.1548732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1548768Z with policy(): 2025-12-04T10:49:11.1548920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1548960Z raise RuntimeError(msg) 2025-12-04T10:49:11.1549366Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1549369Z 2025-12-04T10:49:11.1549442Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1549729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1549732Z 2025-12-04T10:49:11.1549818Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1549901Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1549956Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1550143Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1550214Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1550250Z graph_break [] 2025-12-04T10:49:11.1550320Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1550666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1550709Z if out == self.unknown_value: 2025-12-04T10:49:11.1550782Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1550836Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1550907Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1551085Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1551120Z graph_break [] 2025-12-04T10:49:11.1551172Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1551322Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1551367Z Traceback (most recent call last): 2025-12-04T10:49:11.1551520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1551570Z method(*args, **kwargs) 2025-12-04T10:49:11.1551721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1551761Z method(*args, **kwargs) 2025-12-04T10:49:11.1551979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1552016Z with policy(): 2025-12-04T10:49:11.1552167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1552208Z raise RuntimeError(msg) 2025-12-04T10:49:11.1552632Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1552636Z 2025-12-04T10:49:11.1552709Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1552997Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1553000Z 2025-12-04T10:49:11.1553084Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1553155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1553208Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1553383Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1553454Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1553932Z graph_break [] 2025-12-04T10:49:11.1554003Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1554344Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1554403Z if out == self.unknown_value: 2025-12-04T10:49:11.1554474Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1554527Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1554599Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1554771Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1554808Z graph_break [] 2025-12-04T10:49:11.1554878Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1554932Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1555003Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1555178Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1555214Z graph_break [] 2025-12-04T10:49:11.1555455Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-88fbc6393e914740.xml - 2025-12-04T10:49:11.1555514Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1556159Z FAILED [0.5812s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1556164Z 2025-12-04T10:49:11.1556236Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1556535Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1556538Z 2025-12-04T10:49:11.1556623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1556684Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1556750Z ================== 1 failed, 57 deselected, 2 rerun in 11.59s ================== 2025-12-04T10:49:11.1556787Z Got exit code 1 2025-12-04T10:49:11.1556827Z Retrying single test... 2025-12-04T10:49:11.1557024Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d3eab7fbdcaef3cb.xml 2025-12-04T10:49:11.1557081Z ============================= test session starts ============================== 2025-12-04T10:49:11.1557191Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1557231Z cachedir: .pytest_cache 2025-12-04T10:49:11.1557391Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1557435Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1557476Z configfile: pytest.ini 2025-12-04T10:49:11.1557638Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1557725Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1558007Z stepcurrent: skipping 41 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1558068Z Running 1 items in this shard 2025-12-04T10:49:11.1558073Z 2025-12-04T10:49:11.1558436Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:36:02.254419143 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1558438Z 2025-12-04T10:49:11.1558590Z [W1204 10:36:10.801200480 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1558594Z 2025-12-04T10:49:11.1558743Z [W1204 10:36:10.801382777 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1558745Z 2025-12-04T10:49:11.1558894Z [W1204 10:36:10.804773205 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1558896Z 2025-12-04T10:49:11.1559043Z [W1204 10:36:10.805122579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1559045Z 2025-12-04T10:49:11.1559193Z [W1204 10:36:10.805207327 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1559195Z 2025-12-04T10:49:11.1559342Z [W1204 10:36:10.807803470 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1559345Z 2025-12-04T10:49:11.1559503Z [W1204 10:36:10.808090515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1559506Z 2025-12-04T10:49:11.1559653Z [W1204 10:36:10.808174543 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1559655Z 2025-12-04T10:49:11.1559707Z ('RERUN', {'yellow': True}) [10.2284s] [100%] 2025-12-04T10:49:11.1560070Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:36:11.844760637 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560074Z 2025-12-04T10:49:11.1560220Z [W1204 10:36:11.845264437 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560224Z 2025-12-04T10:49:11.1560373Z [W1204 10:36:11.845373275 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560375Z 2025-12-04T10:49:11.1560522Z [W1204 10:36:11.846782950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560524Z 2025-12-04T10:49:11.1560671Z [W1204 10:36:11.847177073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560673Z 2025-12-04T10:49:11.1560820Z [W1204 10:36:11.847267261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1560822Z 2025-12-04T10:49:11.1560968Z [W1204 10:36:11.849616978 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1560970Z 2025-12-04T10:49:11.1561128Z [W1204 10:36:11.849915633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1561130Z 2025-12-04T10:49:11.1561275Z [W1204 10:36:11.849997171 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1561290Z 2025-12-04T10:49:11.1561341Z ('RERUN', {'yellow': True}) [0.5207s] [100%] 2025-12-04T10:49:11.1561693Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:36:11.331053702 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1561696Z 2025-12-04T10:49:11.1561876Z [W1204 10:36:11.331451485 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1561877Z 2025-12-04T10:49:11.1562026Z [W1204 10:36:11.331543693 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562028Z 2025-12-04T10:49:11.1562173Z [W1204 10:36:11.332954467 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562176Z 2025-12-04T10:49:11.1562323Z [W1204 10:36:11.333306401 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562325Z 2025-12-04T10:49:11.1562472Z [W1204 10:36:11.333396609 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1562474Z 2025-12-04T10:49:11.1562622Z [W1204 10:36:11.335695047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562624Z 2025-12-04T10:49:11.1562785Z [W1204 10:36:11.335961962 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562788Z 2025-12-04T10:49:11.1562934Z [W1204 10:36:11.336043301 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1562938Z 2025-12-04T10:49:11.1562976Z FAILED [0.4925s] [100%] 2025-12-04T10:49:11.1562978Z 2025-12-04T10:49:11.1563028Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1563178Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1563235Z Traceback (most recent call last): 2025-12-04T10:49:11.1563392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1563431Z method(*args, **kwargs) 2025-12-04T10:49:11.1563584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1563623Z method(*args, **kwargs) 2025-12-04T10:49:11.1563776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1563813Z with policy(): 2025-12-04T10:49:11.1563966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1564005Z raise RuntimeError(msg) 2025-12-04T10:49:11.1564400Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1564402Z 2025-12-04T10:49:11.1564477Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1564778Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1564794Z 2025-12-04T10:49:11.1564881Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1564952Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1565007Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1565184Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1565256Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1565293Z graph_break [] 2025-12-04T10:49:11.1565365Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1565707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1565752Z if out == self.unknown_value: 2025-12-04T10:49:11.1565901Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1565945Z Traceback (most recent call last): 2025-12-04T10:49:11.1566098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1566137Z method(*args, **kwargs) 2025-12-04T10:49:11.1566286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1566334Z method(*args, **kwargs) 2025-12-04T10:49:11.1566485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1566521Z with policy(): 2025-12-04T10:49:11.1566673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1566712Z raise RuntimeError(msg) 2025-12-04T10:49:11.1567130Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1567132Z 2025-12-04T10:49:11.1567206Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1567493Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1567496Z 2025-12-04T10:49:11.1567581Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1567652Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1567707Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1567879Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1567951Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1567987Z graph_break [] 2025-12-04T10:49:11.1568058Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1568398Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1568451Z if out == self.unknown_value: 2025-12-04T10:49:11.1568533Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1568588Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1568658Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1568832Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1568869Z graph_break [] 2025-12-04T10:49:11.1568921Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1569070Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1569116Z Traceback (most recent call last): 2025-12-04T10:49:11.1569268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1569309Z method(*args, **kwargs) 2025-12-04T10:49:11.1569459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1569497Z method(*args, **kwargs) 2025-12-04T10:49:11.1569645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1569682Z with policy(): 2025-12-04T10:49:11.1569833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1569873Z raise RuntimeError(msg) 2025-12-04T10:49:11.1570286Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1570290Z 2025-12-04T10:49:11.1570362Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1570648Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1570652Z 2025-12-04T10:49:11.1570744Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1570816Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1570870Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1571043Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1571115Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1571152Z graph_break [] 2025-12-04T10:49:11.1571223Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1571562Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1571606Z if out == self.unknown_value: 2025-12-04T10:49:11.1571676Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1571731Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1571802Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1572029Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1572080Z graph_break [] 2025-12-04T10:49:11.1572150Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1572203Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1572273Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1572445Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1572481Z graph_break [] 2025-12-04T10:49:11.1572723Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d3eab7fbdcaef3cb.xml - 2025-12-04T10:49:11.1572785Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1573413Z FAILED [0.4925s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1573418Z 2025-12-04T10:49:11.1573490Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1573775Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1573790Z 2025-12-04T10:49:11.1573873Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1573934Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1574001Z ================== 1 failed, 57 deselected, 2 rerun in 11.40s ================== 2025-12-04T10:49:11.1574038Z Got exit code 1 2025-12-04T10:49:11.1574274Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1574414Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1574609Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8c8d5d76c7564418.xml 2025-12-04T10:49:11.1574666Z ============================= test session starts ============================== 2025-12-04T10:49:11.1574777Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1574818Z cachedir: .pytest_cache 2025-12-04T10:49:11.1574974Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1575020Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1575060Z configfile: pytest.ini 2025-12-04T10:49:11.1575220Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1575294Z collecting ... collected 58 items / 42 deselected / 16 selected 2025-12-04T10:49:11.1575345Z stepcurrent: skipping 42 already run items. 2025-12-04T10:49:11.1575390Z Running 16 items in this shard 2025-12-04T10:49:11.1575392Z 2025-12-04T10:49:11.1575643Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0096s] [ 6%] 2025-12-04T10:49:11.1575899Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5305s] [ 6%] 2025-12-04T10:49:11.1576130Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.5355s] [ 6%] 2025-12-04T10:49:11.1576133Z 2025-12-04T10:49:11.1576184Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1576332Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1576376Z Traceback (most recent call last): 2025-12-04T10:49:11.1576533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1576572Z method(*args, **kwargs) 2025-12-04T10:49:11.1576724Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1576764Z method(*args, **kwargs) 2025-12-04T10:49:11.1576914Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1576950Z with policy(): 2025-12-04T10:49:11.1577101Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1577142Z raise RuntimeError(msg) 2025-12-04T10:49:11.1577550Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1577553Z 2025-12-04T10:49:11.1577624Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1577916Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1577920Z 2025-12-04T10:49:11.1578006Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1578076Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1578140Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1578412Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1578485Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1578520Z graph_break [] 2025-12-04T10:49:11.1578670Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1578715Z Traceback (most recent call last): 2025-12-04T10:49:11.1578869Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1578908Z method(*args, **kwargs) 2025-12-04T10:49:11.1579058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1579097Z method(*args, **kwargs) 2025-12-04T10:49:11.1579247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1579283Z with policy(): 2025-12-04T10:49:11.1579447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1579487Z raise RuntimeError(msg) 2025-12-04T10:49:11.1579890Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1579904Z 2025-12-04T10:49:11.1579975Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1580263Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1580265Z 2025-12-04T10:49:11.1580350Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1580421Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1580477Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1580748Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1580819Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1580854Z graph_break [] 2025-12-04T10:49:11.1580926Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1580979Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1581049Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1581326Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1581365Z graph_break [] 2025-12-04T10:49:11.1581417Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1581564Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1581609Z Traceback (most recent call last): 2025-12-04T10:49:11.1581769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1581809Z method(*args, **kwargs) 2025-12-04T10:49:11.1582005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1582044Z method(*args, **kwargs) 2025-12-04T10:49:11.1582196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1582232Z with policy(): 2025-12-04T10:49:11.1582384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1582426Z raise RuntimeError(msg) 2025-12-04T10:49:11.1582827Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1582831Z 2025-12-04T10:49:11.1582901Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1583186Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1583207Z 2025-12-04T10:49:11.1583291Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1583373Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1583427Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1583697Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1583767Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1583803Z graph_break [] 2025-12-04T10:49:11.1583872Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1583927Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1583997Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1584264Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1584301Z graph_break [] 2025-12-04T10:49:11.1584371Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1584423Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1584494Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1584762Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1584811Z graph_break [] 2025-12-04T10:49:11.1585053Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8c8d5d76c7564418.xml - 2025-12-04T10:49:11.1585114Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1585761Z FAILED [0.5355s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1585764Z 2025-12-04T10:49:11.1585835Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1586121Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1586124Z 2025-12-04T10:49:11.1586208Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1586267Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1586333Z ================== 1 failed, 42 deselected, 2 rerun in 4.22s =================== 2025-12-04T10:49:11.1586368Z Got exit code 1 2025-12-04T10:49:11.1586409Z Retrying single test... 2025-12-04T10:49:11.1586606Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4e7f0275c0c9d8a2.xml 2025-12-04T10:49:11.1586661Z ============================= test session starts ============================== 2025-12-04T10:49:11.1586782Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1586825Z cachedir: .pytest_cache 2025-12-04T10:49:11.1586981Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1587038Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1587078Z configfile: pytest.ini 2025-12-04T10:49:11.1587239Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1587312Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1587596Z stepcurrent: skipping 42 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1587640Z Running 1 items in this shard 2025-12-04T10:49:11.1587643Z 2025-12-04T10:49:11.1588001Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:36:33.366425215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1588004Z 2025-12-04T10:49:11.1588155Z [W1204 10:36:41.015585633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588157Z 2025-12-04T10:49:11.1588307Z [W1204 10:36:41.015761862 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588309Z 2025-12-04T10:49:11.1588457Z [W1204 10:36:41.019620347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588461Z 2025-12-04T10:49:11.1588617Z [W1204 10:36:41.020022305 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588621Z 2025-12-04T10:49:11.1588769Z [W1204 10:36:41.020106284 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588772Z 2025-12-04T10:49:11.1588919Z [W1204 10:36:41.022861417 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1588921Z 2025-12-04T10:49:11.1589066Z [W1204 10:36:41.023184494 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1589079Z 2025-12-04T10:49:11.1589226Z [W1204 10:36:41.023265914 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1589228Z 2025-12-04T10:49:11.1589277Z ('RERUN', {'yellow': True}) [10.8295s] [100%] 2025-12-04T10:49:11.1589634Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:36:42.861786108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1589637Z 2025-12-04T10:49:11.1589784Z [W1204 10:36:42.862166766 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1589786Z 2025-12-04T10:49:11.1589934Z [W1204 10:36:42.862251605 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1589936Z 2025-12-04T10:49:11.1590082Z [W1204 10:36:42.863677456 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1590084Z 2025-12-04T10:49:11.1590231Z [W1204 10:36:42.863932654 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1590244Z 2025-12-04T10:49:11.1590391Z [W1204 10:36:42.864017704 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1590403Z 2025-12-04T10:49:11.1590551Z [W1204 10:36:42.866226480 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1590553Z 2025-12-04T10:49:11.1590700Z [W1204 10:36:42.866481258 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1590703Z 2025-12-04T10:49:11.1590851Z [W1204 10:36:42.866554698 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1590853Z 2025-12-04T10:49:11.1590901Z ('RERUN', {'yellow': True}) [0.6840s] [100%] 2025-12-04T10:49:11.1591259Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:36:43.540381203 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1591263Z 2025-12-04T10:49:11.1591410Z [W1204 10:36:43.540768461 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1591412Z 2025-12-04T10:49:11.1591558Z [W1204 10:36:43.540851880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1591560Z 2025-12-04T10:49:11.1591707Z [W1204 10:36:43.542269181 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1591708Z 2025-12-04T10:49:11.1591900Z [W1204 10:36:43.542529009 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1591904Z 2025-12-04T10:49:11.1592051Z [W1204 10:36:43.542605209 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1592054Z 2025-12-04T10:49:11.1592199Z [W1204 10:36:43.544815884 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1592202Z 2025-12-04T10:49:11.1592347Z [W1204 10:36:43.545078373 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1592349Z 2025-12-04T10:49:11.1592508Z [W1204 10:36:43.545156112 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1592510Z 2025-12-04T10:49:11.1592548Z FAILED [0.7092s] [100%] 2025-12-04T10:49:11.1592550Z 2025-12-04T10:49:11.1592602Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1592752Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1592798Z Traceback (most recent call last): 2025-12-04T10:49:11.1592954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1592995Z method(*args, **kwargs) 2025-12-04T10:49:11.1593145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1593186Z method(*args, **kwargs) 2025-12-04T10:49:11.1593336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1593374Z with policy(): 2025-12-04T10:49:11.1593526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1593579Z raise RuntimeError(msg) 2025-12-04T10:49:11.1593972Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1593990Z 2025-12-04T10:49:11.1594062Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1594350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1594352Z 2025-12-04T10:49:11.1594437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1594509Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1594564Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1594834Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1594907Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1594944Z graph_break [] 2025-12-04T10:49:11.1595015Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1595359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1595403Z if out == self.unknown_value: 2025-12-04T10:49:11.1595561Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1595607Z Traceback (most recent call last): 2025-12-04T10:49:11.1595760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1595800Z method(*args, **kwargs) 2025-12-04T10:49:11.1595950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1595991Z method(*args, **kwargs) 2025-12-04T10:49:11.1596149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1596186Z with policy(): 2025-12-04T10:49:11.1596337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1596378Z raise RuntimeError(msg) 2025-12-04T10:49:11.1596778Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1596781Z 2025-12-04T10:49:11.1596853Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1597141Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1597144Z 2025-12-04T10:49:11.1597228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1597301Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1597366Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1597638Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1597719Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1597756Z graph_break [] 2025-12-04T10:49:11.1597826Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1598166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1598208Z if out == self.unknown_value: 2025-12-04T10:49:11.1598279Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1598334Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1598405Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1598678Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1598714Z graph_break [] 2025-12-04T10:49:11.1598765Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1598917Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1598962Z Traceback (most recent call last): 2025-12-04T10:49:11.1599115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1599170Z method(*args, **kwargs) 2025-12-04T10:49:11.1599320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1599359Z method(*args, **kwargs) 2025-12-04T10:49:11.1599511Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1599548Z with policy(): 2025-12-04T10:49:11.1599699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1599740Z raise RuntimeError(msg) 2025-12-04T10:49:11.1600158Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1600161Z 2025-12-04T10:49:11.1600233Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1600519Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1600523Z 2025-12-04T10:49:11.1600607Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1600678Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1600733Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1601001Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1601081Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1601118Z graph_break [] 2025-12-04T10:49:11.1601188Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1601537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1601579Z if out == self.unknown_value: 2025-12-04T10:49:11.1601651Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1601705Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1601776Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1602085Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1602124Z graph_break [] 2025-12-04T10:49:11.1602195Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1602249Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1602318Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1602583Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1602620Z graph_break [] 2025-12-04T10:49:11.1602861Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4e7f0275c0c9d8a2.xml - 2025-12-04T10:49:11.1602934Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1603563Z FAILED [0.7092s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1603567Z 2025-12-04T10:49:11.1603650Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1603936Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1603938Z 2025-12-04T10:49:11.1604023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1604084Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1604150Z ================== 1 failed, 57 deselected, 2 rerun in 12.37s ================== 2025-12-04T10:49:11.1604187Z Got exit code 1 2025-12-04T10:49:11.1604226Z Retrying single test... 2025-12-04T10:49:11.1604421Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-02e101d703b1ea9d.xml 2025-12-04T10:49:11.1604477Z ============================= test session starts ============================== 2025-12-04T10:49:11.1604589Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1604630Z cachedir: .pytest_cache 2025-12-04T10:49:11.1604788Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1604847Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1604887Z configfile: pytest.ini 2025-12-04T10:49:11.1605048Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1605134Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1605417Z stepcurrent: skipping 42 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1605460Z Running 1 items in this shard 2025-12-04T10:49:11.1605464Z 2025-12-04T10:49:11.1605822Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:36:53.357045533 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1605825Z 2025-12-04T10:49:11.1605976Z [W1204 10:37:01.987020571 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1605979Z 2025-12-04T10:49:11.1606129Z [W1204 10:37:01.987174880 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1606131Z 2025-12-04T10:49:11.1606278Z [W1204 10:37:01.990563567 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1606281Z 2025-12-04T10:49:11.1606428Z [W1204 10:37:01.990876725 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1606430Z 2025-12-04T10:49:11.1606586Z [W1204 10:37:01.990953264 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1606590Z 2025-12-04T10:49:11.1606737Z [W1204 10:37:01.993455837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1606740Z 2025-12-04T10:49:11.1606886Z [W1204 10:37:01.993726025 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1606888Z 2025-12-04T10:49:11.1607033Z [W1204 10:37:01.993802195 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1607035Z 2025-12-04T10:49:11.1607097Z ('RERUN', {'yellow': True}) [10.7572s] [100%] 2025-12-04T10:49:11.1607453Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:37:02.760203050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1607456Z 2025-12-04T10:49:11.1607602Z [W1204 10:37:02.760594388 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1607605Z 2025-12-04T10:49:11.1607752Z [W1204 10:37:02.760691097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1607754Z 2025-12-04T10:49:11.1607899Z [W1204 10:37:02.762111687 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1607900Z 2025-12-04T10:49:11.1608048Z [W1204 10:37:02.762380235 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1608050Z 2025-12-04T10:49:11.1608198Z [W1204 10:37:02.762460355 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1608210Z 2025-12-04T10:49:11.1608359Z [W1204 10:37:02.764682350 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1608361Z 2025-12-04T10:49:11.1608518Z [W1204 10:37:02.764942348 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1608520Z 2025-12-04T10:49:11.1608666Z [W1204 10:37:02.765024037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1608668Z 2025-12-04T10:49:11.1608716Z ('RERUN', {'yellow': True}) [0.6225s] [100%] 2025-12-04T10:49:11.1609069Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:37:02.387356398 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609074Z 2025-12-04T10:49:11.1609220Z [W1204 10:37:02.387761656 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609222Z 2025-12-04T10:49:11.1609369Z [W1204 10:37:02.387853535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609371Z 2025-12-04T10:49:11.1609516Z [W1204 10:37:02.389262075 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609518Z 2025-12-04T10:49:11.1609667Z [W1204 10:37:02.389515064 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1609668Z 2025-12-04T10:49:11.1609815Z [W1204 10:37:02.389591163 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609828Z 2025-12-04T10:49:11.1609975Z [W1204 10:37:02.391810918 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1609977Z 2025-12-04T10:49:11.1610124Z [W1204 10:37:02.392070376 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1610126Z 2025-12-04T10:49:11.1610275Z [W1204 10:37:02.392149236 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1610277Z 2025-12-04T10:49:11.1610316Z FAILED [0.6213s] [100%] 2025-12-04T10:49:11.1610318Z 2025-12-04T10:49:11.1610378Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1610527Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1610572Z Traceback (most recent call last): 2025-12-04T10:49:11.1610732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1610771Z method(*args, **kwargs) 2025-12-04T10:49:11.1610923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1610963Z method(*args, **kwargs) 2025-12-04T10:49:11.1611114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1611151Z with policy(): 2025-12-04T10:49:11.1611302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1611343Z raise RuntimeError(msg) 2025-12-04T10:49:11.1611738Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1611751Z 2025-12-04T10:49:11.1611824Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1612173Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1612175Z 2025-12-04T10:49:11.1612261Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1612332Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1612389Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1612665Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1612738Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1612775Z graph_break [] 2025-12-04T10:49:11.1612846Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1613188Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1613232Z if out == self.unknown_value: 2025-12-04T10:49:11.1613381Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1613426Z Traceback (most recent call last): 2025-12-04T10:49:11.1613591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1613631Z method(*args, **kwargs) 2025-12-04T10:49:11.1613782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1613821Z method(*args, **kwargs) 2025-12-04T10:49:11.1613970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1614006Z with policy(): 2025-12-04T10:49:11.1614157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1614213Z raise RuntimeError(msg) 2025-12-04T10:49:11.1614614Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1614617Z 2025-12-04T10:49:11.1614689Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1614977Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1614979Z 2025-12-04T10:49:11.1615064Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1615136Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1615191Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1615460Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1615544Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1615579Z graph_break [] 2025-12-04T10:49:11.1615650Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1616003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1616047Z if out == self.unknown_value: 2025-12-04T10:49:11.1616117Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1616173Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1616245Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1616514Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1616551Z graph_break [] 2025-12-04T10:49:11.1616605Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1616755Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1616799Z Traceback (most recent call last): 2025-12-04T10:49:11.1616955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1616994Z method(*args, **kwargs) 2025-12-04T10:49:11.1617145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1617183Z method(*args, **kwargs) 2025-12-04T10:49:11.1617343Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1617381Z with policy(): 2025-12-04T10:49:11.1617532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1617572Z raise RuntimeError(msg) 2025-12-04T10:49:11.1617988Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1617991Z 2025-12-04T10:49:11.1618063Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1618351Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1618354Z 2025-12-04T10:49:11.1618439Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1618510Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1618564Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1618832Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1618904Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1618940Z graph_break [] 2025-12-04T10:49:11.1619011Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1619354Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1619419Z if out == self.unknown_value: 2025-12-04T10:49:11.1619488Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1619542Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1619611Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1619879Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1619915Z graph_break [] 2025-12-04T10:49:11.1619985Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1620040Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1620109Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1620376Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1620415Z graph_break [] 2025-12-04T10:49:11.1620654Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-02e101d703b1ea9d.xml - 2025-12-04T10:49:11.1620713Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1621358Z FAILED [0.6213s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1621362Z 2025-12-04T10:49:11.1621434Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1621717Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1621729Z 2025-12-04T10:49:11.1621814Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1621909Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1621975Z ================== 1 failed, 57 deselected, 2 rerun in 12.15s ================== 2025-12-04T10:49:11.1622012Z Got exit code 1 2025-12-04T10:49:11.1622250Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1622376Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1622574Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-522b71eaadaaaa46.xml 2025-12-04T10:49:11.1622629Z ============================= test session starts ============================== 2025-12-04T10:49:11.1622742Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1622782Z cachedir: .pytest_cache 2025-12-04T10:49:11.1622940Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1622998Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1623038Z configfile: pytest.ini 2025-12-04T10:49:11.1623198Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1623285Z collecting ... collected 58 items / 43 deselected / 15 selected 2025-12-04T10:49:11.1623338Z stepcurrent: skipping 43 already run items. 2025-12-04T10:49:11.1623381Z Running 15 items in this shard 2025-12-04T10:49:11.1623383Z 2025-12-04T10:49:11.1623637Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6536s] [ 6%] 2025-12-04T10:49:11.1623882Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5458s] [ 6%] 2025-12-04T10:49:11.1624104Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.6527s] [ 6%] 2025-12-04T10:49:11.1624107Z 2025-12-04T10:49:11.1624158Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1624306Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1624350Z Traceback (most recent call last): 2025-12-04T10:49:11.1624507Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1624548Z method(*args, **kwargs) 2025-12-04T10:49:11.1624698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1624738Z method(*args, **kwargs) 2025-12-04T10:49:11.1624901Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1624938Z with policy(): 2025-12-04T10:49:11.1625089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1625131Z raise RuntimeError(msg) 2025-12-04T10:49:11.1625539Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1625542Z 2025-12-04T10:49:11.1625614Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1625904Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1625907Z 2025-12-04T10:49:11.1625992Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1626063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1626118Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1626294Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1626367Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1626404Z graph_break [] 2025-12-04T10:49:11.1626553Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1626597Z Traceback (most recent call last): 
2025-12-04T10:49:11.1626768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1626807Z method(*args, **kwargs) 2025-12-04T10:49:11.1626957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1627006Z method(*args, **kwargs) 2025-12-04T10:49:11.1627154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1627191Z with policy(): 2025-12-04T10:49:11.1627342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1627383Z raise RuntimeError(msg) 2025-12-04T10:49:11.1627787Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1627790Z 2025-12-04T10:49:11.1627862Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1628156Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1628158Z 2025-12-04T10:49:11.1628242Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1628315Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1628369Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1628553Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1628625Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1628662Z graph_break [] 2025-12-04T10:49:11.1628732Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1628786Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1628855Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1629028Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1629074Z graph_break [] 2025-12-04T10:49:11.1629126Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1629274Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1629322Z Traceback (most recent call last): 2025-12-04T10:49:11.1629474Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1629514Z method(*args, **kwargs) 2025-12-04T10:49:11.1629665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1629704Z method(*args, **kwargs) 2025-12-04T10:49:11.1629853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1629889Z with policy(): 2025-12-04T10:49:11.1630043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1630083Z raise RuntimeError(msg) 2025-12-04T10:49:11.1630490Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1630511Z 2025-12-04T10:49:11.1630582Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1630870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1630873Z 2025-12-04T10:49:11.1630958Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1631029Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1631083Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1631259Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1631330Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1631367Z graph_break [] 2025-12-04T10:49:11.1631438Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1631490Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1631560Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1631732Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1631769Z graph_break [] 2025-12-04T10:49:11.1631839Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1631938Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1632020Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1632195Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1632232Z graph_break [] 2025-12-04T10:49:11.1632477Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-522b71eaadaaaa46.xml - 2025-12-04T10:49:11.1632535Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1633185Z FAILED [0.6527s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1633188Z 2025-12-04T10:49:11.1633260Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1633546Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1633548Z 2025-12-04T10:49:11.1633632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1633692Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1633758Z ================== 1 failed, 43 deselected, 2 rerun in 4.03s =================== 2025-12-04T10:49:11.1633794Z Got exit code 1 2025-12-04T10:49:11.1633834Z Retrying single test... 
2025-12-04T10:49:11.1634032Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5891c7a6c72c224e.xml 2025-12-04T10:49:11.1634100Z ============================= test session starts ============================== 2025-12-04T10:49:11.1634225Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1634266Z cachedir: .pytest_cache 2025-12-04T10:49:11.1634424Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1634470Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1634509Z configfile: pytest.ini 2025-12-04T10:49:11.1634671Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1634744Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1635027Z stepcurrent: skipping 43 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1635075Z Running 1 items in this shard 2025-12-04T10:49:11.1635078Z 2025-12-04T10:49:11.1635437Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:23.466297230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1635439Z 2025-12-04T10:49:11.1635592Z [W1204 10:37:31.151470500 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1635593Z 2025-12-04T10:49:11.1635743Z [W1204 10:37:31.151627749 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1635746Z 2025-12-04T10:49:11.1635902Z [W1204 10:37:31.155153153 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1635904Z 2025-12-04T10:49:11.1636052Z [W1204 10:37:31.155626899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1636055Z 2025-12-04T10:49:11.1636203Z [W1204 10:37:31.155715258 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1636205Z 2025-12-04T10:49:11.1636364Z [W1204 10:37:31.158332809 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1636366Z 2025-12-04T10:49:11.1636513Z [W1204 10:37:31.158629127 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1636514Z 2025-12-04T10:49:11.1636664Z [W1204 10:37:31.158706656 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1636666Z 2025-12-04T10:49:11.1636717Z ('RERUN', {'yellow': True}) [10.3365s] [100%] 2025-12-04T10:49:11.1637075Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:32.192591842 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1637077Z 2025-12-04T10:49:11.1637227Z [W1204 10:37:32.193009669 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637229Z 2025-12-04T10:49:11.1637376Z [W1204 10:37:32.193095918 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637378Z 2025-12-04T10:49:11.1637536Z [W1204 10:37:32.194469858 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637538Z 2025-12-04T10:49:11.1637684Z [W1204 10:37:32.194795156 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637699Z 2025-12-04T10:49:11.1637846Z [W1204 10:37:32.194874195 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637848Z 2025-12-04T10:49:11.1637996Z [W1204 10:37:32.197080329 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1637997Z 2025-12-04T10:49:11.1638144Z [W1204 10:37:32.197337877 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1638145Z 2025-12-04T10:49:11.1638293Z [W1204 10:37:32.197414426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1638296Z 2025-12-04T10:49:11.1638345Z ('RERUN', {'yellow': True}) [0.5459s] [100%] 2025-12-04T10:49:11.1638704Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:33.752652078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1638706Z 2025-12-04T10:49:11.1638854Z [W1204 10:37:33.753051975 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1638855Z 2025-12-04T10:49:11.1639001Z [W1204 10:37:33.753139594 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639003Z 2025-12-04T10:49:11.1639161Z [W1204 10:37:33.754555444 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639164Z 2025-12-04T10:49:11.1639312Z [W1204 10:37:33.754894291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639315Z 2025-12-04T10:49:11.1639462Z [W1204 10:37:33.754973210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639464Z 2025-12-04T10:49:11.1639631Z [W1204 10:37:33.757328693 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639633Z 2025-12-04T10:49:11.1639781Z [W1204 10:37:33.757750050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1639782Z 2025-12-04T10:49:11.1639931Z [W1204 10:37:33.757832059 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1639934Z 2025-12-04T10:49:11.1639971Z FAILED [0.5418s] [100%] 2025-12-04T10:49:11.1639973Z 2025-12-04T10:49:11.1640026Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1640175Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1640220Z Traceback (most recent call last): 2025-12-04T10:49:11.1640376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1640417Z method(*args, **kwargs) 2025-12-04T10:49:11.1640572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1640610Z method(*args, **kwargs) 2025-12-04T10:49:11.1640762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1640815Z with policy(): 2025-12-04T10:49:11.1640966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1641021Z raise RuntimeError(msg) 2025-12-04T10:49:11.1641420Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1641422Z 2025-12-04T10:49:11.1641494Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1641784Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1641787Z 2025-12-04T10:49:11.1641914Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1641987Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1642041Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1642216Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1642288Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1642325Z graph_break [] 2025-12-04T10:49:11.1642397Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1642756Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1642801Z if out == self.unknown_value: 2025-12-04T10:49:11.1642950Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1642996Z Traceback (most recent call last): 2025-12-04T10:49:11.1643149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1643189Z method(*args, **kwargs) 2025-12-04T10:49:11.1643352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1643392Z method(*args, **kwargs) 2025-12-04T10:49:11.1643540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1643577Z with policy(): 2025-12-04T10:49:11.1643729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1643770Z raise RuntimeError(msg) 2025-12-04T10:49:11.1644176Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1644180Z 2025-12-04T10:49:11.1644251Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1644539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1644541Z 2025-12-04T10:49:11.1644638Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1644710Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1644764Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1644954Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1645024Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1645061Z graph_break [] 2025-12-04T10:49:11.1645132Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1645474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1645519Z if out == self.unknown_value: 2025-12-04T10:49:11.1645590Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1645645Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1645716Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1645891Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1645927Z graph_break [] 2025-12-04T10:49:11.1645978Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1646128Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1646173Z Traceback (most recent call last): 2025-12-04T10:49:11.1646335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1646377Z method(*args, **kwargs) 2025-12-04T10:49:11.1646526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1646567Z method(*args, **kwargs) 2025-12-04T10:49:11.1646716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1646752Z with policy(): 2025-12-04T10:49:11.1646903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1646955Z raise RuntimeError(msg) 2025-12-04T10:49:11.1647363Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1647367Z 2025-12-04T10:49:11.1647439Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1647726Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1647729Z 2025-12-04T10:49:11.1647814Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1647884Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1647940Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1648112Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1648185Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1648232Z graph_break [] 2025-12-04T10:49:11.1648302Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1648642Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1648696Z if out == self.unknown_value: 2025-12-04T10:49:11.1648766Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1648820Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1648891Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1649066Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1649103Z graph_break [] 2025-12-04T10:49:11.1649174Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1649228Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1649297Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1649469Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1649505Z graph_break [] 2025-12-04T10:49:11.1649746Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-5891c7a6c72c224e.xml - 2025-12-04T10:49:11.1649806Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1650454Z FAILED [0.5418s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1650459Z 2025-12-04T10:49:11.1650531Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1650827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1650831Z 2025-12-04T10:49:11.1650914Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1650975Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1651042Z ================== 1 failed, 57 deselected, 2 rerun in 11.60s ================== 2025-12-04T10:49:11.1651078Z Got exit code 1 2025-12-04T10:49:11.1651118Z Retrying single test... 2025-12-04T10:49:11.1651314Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f86bf80852151ab7.xml 2025-12-04T10:49:11.1651372Z ============================= test session starts ============================== 2025-12-04T10:49:11.1651486Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1651526Z cachedir: .pytest_cache 2025-12-04T10:49:11.1651684Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1651728Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1651770Z configfile: pytest.ini 2025-12-04T10:49:11.1651976Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1652063Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1652361Z stepcurrent: skipping 43 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1652406Z Running 1 items in this shard 2025-12-04T10:49:11.1652408Z 2025-12-04T10:49:11.1652768Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:42.148390416 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1652770Z 2025-12-04T10:49:11.1652921Z [W1204 10:37:50.818935661 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1652924Z 2025-12-04T10:49:11.1653074Z [W1204 10:37:50.819105830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653077Z 2025-12-04T10:49:11.1653224Z [W1204 10:37:50.822398785 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653226Z 2025-12-04T10:49:11.1653373Z [W1204 10:37:50.822731552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653377Z 2025-12-04T10:49:11.1653525Z [W1204 10:37:50.822823201 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653527Z 2025-12-04T10:49:11.1653692Z [W1204 10:37:50.825441121 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653695Z 2025-12-04T10:49:11.1653843Z [W1204 10:37:50.825723259 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1653846Z 2025-12-04T10:49:11.1653991Z [W1204 10:37:50.825804958 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1653993Z 2025-12-04T10:49:11.1654044Z ('RERUN', {'yellow': True}) [10.4382s] [100%] 2025-12-04T10:49:11.1654414Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:51.908754895 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1654417Z 2025-12-04T10:49:11.1654566Z [W1204 10:37:51.909182522 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1654569Z 2025-12-04T10:49:11.1654717Z [W1204 10:37:51.909272651 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1654720Z 2025-12-04T10:49:11.1654867Z [W1204 10:37:51.910696650 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1654869Z 2025-12-04T10:49:11.1655016Z [W1204 10:37:51.911035958 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1655018Z 2025-12-04T10:49:11.1655164Z [W1204 10:37:51.911118027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1655167Z 2025-12-04T10:49:11.1655314Z [W1204 10:37:51.913482909 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1655325Z 2025-12-04T10:49:11.1655472Z [W1204 10:37:51.913749837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1655485Z 2025-12-04T10:49:11.1655631Z [W1204 10:37:51.913829716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1655633Z 2025-12-04T10:49:11.1655682Z ('RERUN', {'yellow': True}) [0.5246s] [100%] 2025-12-04T10:49:11.1656040Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:37:51.448547641 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656042Z 2025-12-04T10:49:11.1656191Z [W1204 10:37:51.448930068 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656194Z 2025-12-04T10:49:11.1656342Z [W1204 10:37:51.449020317 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656345Z 2025-12-04T10:49:11.1656491Z [W1204 10:37:51.450418567 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656493Z 2025-12-04T10:49:11.1656640Z [W1204 10:37:51.450748494 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656642Z 2025-12-04T10:49:11.1656788Z [W1204 10:37:51.450827733 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1656790Z 2025-12-04T10:49:11.1656938Z [W1204 10:37:51.453128295 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1656950Z 2025-12-04T10:49:11.1657098Z [W1204 10:37:51.453395883 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1657100Z 2025-12-04T10:49:11.1657248Z [W1204 10:37:51.453473453 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1657249Z 2025-12-04T10:49:11.1657288Z FAILED [0.5404s] [100%] 2025-12-04T10:49:11.1657291Z 2025-12-04T10:49:11.1657342Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1657504Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1657549Z Traceback (most recent call last): 2025-12-04T10:49:11.1657707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1657748Z method(*args, **kwargs) 2025-12-04T10:49:11.1657902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1657940Z method(*args, **kwargs) 2025-12-04T10:49:11.1658092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1658128Z with policy(): 2025-12-04T10:49:11.1658281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1658320Z raise RuntimeError(msg) 2025-12-04T10:49:11.1658719Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1658732Z 2025-12-04T10:49:11.1658806Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1659094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1659107Z 2025-12-04T10:49:11.1659193Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1659264Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1659320Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1659495Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1659567Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1659607Z graph_break [] 2025-12-04T10:49:11.1659678Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1660023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1660068Z if out == self.unknown_value: 2025-12-04T10:49:11.1660219Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1660263Z Traceback (most recent call last): 2025-12-04T10:49:11.1660417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1660456Z method(*args, **kwargs) 2025-12-04T10:49:11.1660616Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1660656Z method(*args, **kwargs) 2025-12-04T10:49:11.1660808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1660845Z with policy(): 2025-12-04T10:49:11.1660996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1661036Z raise RuntimeError(msg) 2025-12-04T10:49:11.1661452Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1661455Z 2025-12-04T10:49:11.1661527Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1661818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1661821Z 2025-12-04T10:49:11.1661958Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1662028Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1662083Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1662261Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1662333Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1662369Z graph_break [] 2025-12-04T10:49:11.1662439Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1662796Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1662852Z if out == self.unknown_value: 2025-12-04T10:49:11.1662922Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1662976Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1663046Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1663222Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1663257Z graph_break [] 2025-12-04T10:49:11.1663308Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1663458Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1663504Z Traceback (most recent call last): 2025-12-04T10:49:11.1663657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1663697Z method(*args, **kwargs) 2025-12-04T10:49:11.1663847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1663885Z method(*args, **kwargs) 2025-12-04T10:49:11.1664034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1664070Z with policy(): 2025-12-04T10:49:11.1664222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1664261Z raise RuntimeError(msg) 2025-12-04T10:49:11.1664687Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1664690Z 2025-12-04T10:49:11.1664761Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1665060Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1665063Z 2025-12-04T10:49:11.1665147Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1665219Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1665274Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1665448Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1665520Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1665555Z graph_break [] 2025-12-04T10:49:11.1665625Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1665968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1666011Z if out == self.unknown_value: 2025-12-04T10:49:11.1666084Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1666138Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1666218Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1666391Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1666442Z graph_break [] 2025-12-04T10:49:11.1666515Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1666567Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1666638Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1666811Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1666848Z graph_break [] 2025-12-04T10:49:11.1667089Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f86bf80852151ab7.xml - 2025-12-04T10:49:11.1667150Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1667787Z FAILED [0.5404s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1667790Z 2025-12-04T10:49:11.1667861Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1668159Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1668164Z 2025-12-04T10:49:11.1668247Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1668309Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1668374Z ================== 1 failed, 57 deselected, 2 rerun in 11.65s ================== 2025-12-04T10:49:11.1668410Z Got exit code 1 2025-12-04T10:49:11.1668648Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1668785Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1668984Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dea0ef6010352230.xml 2025-12-04T10:49:11.1669042Z ============================= test session starts ============================== 2025-12-04T10:49:11.1669153Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1669195Z cachedir: .pytest_cache 2025-12-04T10:49:11.1669352Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1669396Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1669436Z configfile: pytest.ini 2025-12-04T10:49:11.1669599Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1669671Z collecting ... collected 58 items / 44 deselected / 14 selected 2025-12-04T10:49:11.1669725Z stepcurrent: skipping 44 already run items. 2025-12-04T10:49:11.1669769Z Running 14 items in this shard 2025-12-04T10:49:11.1669771Z 2025-12-04T10:49:11.1670017Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7039s] [ 7%] 2025-12-04T10:49:11.1670268Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6369s] [ 7%] 2025-12-04T10:49:11.1670497Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.6151s] [ 7%] 2025-12-04T10:49:11.1670499Z 2025-12-04T10:49:11.1670550Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1670701Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1670745Z Traceback (most recent call last): 2025-12-04T10:49:11.1670904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1670944Z method(*args, **kwargs) 2025-12-04T10:49:11.1671096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1671136Z method(*args, **kwargs) 2025-12-04T10:49:11.1671286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1671322Z with policy(): 2025-12-04T10:49:11.1671475Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1671515Z raise RuntimeError(msg) 2025-12-04T10:49:11.1671966Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1671970Z 2025-12-04T10:49:11.1672042Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1672328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1672330Z 2025-12-04T10:49:11.1672415Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1672498Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1672554Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1672729Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1672801Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1672837Z graph_break [] 2025-12-04T10:49:11.1672986Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1673031Z Traceback (most recent call last): 2025-12-04T10:49:11.1673186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1673226Z method(*args, **kwargs) 
2025-12-04T10:49:11.1673377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1673415Z method(*args, **kwargs) 2025-12-04T10:49:11.1673565Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1673601Z with policy(): 2025-12-04T10:49:11.1673766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1673805Z raise RuntimeError(msg) 2025-12-04T10:49:11.1674202Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1674219Z 2025-12-04T10:49:11.1674290Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1674575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1674577Z 2025-12-04T10:49:11.1674662Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1674734Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1674789Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1674962Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1675034Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.1675070Z graph_break [] 2025-12-04T10:49:11.1675141Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1675195Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1675265Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1675451Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1675488Z graph_break [] 2025-12-04T10:49:11.1675540Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1675689Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1675734Z Traceback (most recent call last): 2025-12-04T10:49:11.1675886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1675926Z method(*args, **kwargs) 2025-12-04T10:49:11.1676087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1676127Z method(*args, **kwargs) 2025-12-04T10:49:11.1676275Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1676314Z with policy(): 2025-12-04T10:49:11.1676465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1676505Z raise RuntimeError(msg) 2025-12-04T10:49:11.1676903Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1676905Z 2025-12-04T10:49:11.1676978Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1677262Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1677266Z 2025-12-04T10:49:11.1677361Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1679071Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1679151Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1679329Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1679401Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1679437Z graph_break [] 2025-12-04T10:49:11.1679512Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1679565Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1679636Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1679809Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1679847Z graph_break [] 2025-12-04T10:49:11.1679917Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1679972Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1680041Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1680212Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1680248Z graph_break [] 2025-12-04T10:49:11.1680491Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dea0ef6010352230.xml - 2025-12-04T10:49:11.1680551Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1681188Z FAILED [0.6151s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1681195Z 2025-12-04T10:49:11.1681267Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1681567Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1681569Z 2025-12-04T10:49:11.1681655Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1681717Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1681785Z ================== 1 failed, 44 deselected, 2 rerun in 4.10s =================== 2025-12-04T10:49:11.1681822Z Got exit code 1 2025-12-04T10:49:11.1682109Z Retrying single test... 
2025-12-04T10:49:11.1682306Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c02a83b16912c492.xml 2025-12-04T10:49:11.1682362Z ============================= test session starts ============================== 2025-12-04T10:49:11.1682476Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1682516Z cachedir: .pytest_cache 2025-12-04T10:49:11.1682697Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1682742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1682809Z configfile: pytest.ini 2025-12-04T10:49:11.1682971Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1683061Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1683342Z stepcurrent: skipping 44 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1683387Z Running 1 items in this shard 2025-12-04T10:49:11.1683389Z 2025-12-04T10:49:11.1683750Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:13.595773265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1683752Z 2025-12-04T10:49:11.1683907Z [W1204 10:38:20.345028283 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1683909Z 2025-12-04T10:49:11.1684059Z [W1204 10:38:20.345220852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684062Z 2025-12-04T10:49:11.1684209Z [W1204 10:38:20.349708284 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684211Z 2025-12-04T10:49:11.1684360Z [W1204 10:38:20.350162481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684363Z 2025-12-04T10:49:11.1684510Z [W1204 10:38:20.350257400 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684512Z 2025-12-04T10:49:11.1684673Z [W1204 10:38:20.352982268 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684676Z 2025-12-04T10:49:11.1684822Z [W1204 10:38:20.353283625 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684825Z 2025-12-04T10:49:11.1684972Z [W1204 10:38:20.353364714 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1684974Z 2025-12-04T10:49:11.1685026Z ('RERUN', {'yellow': True}) [10.4667s] [100%] 2025-12-04T10:49:11.1685395Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:21.381990164 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1685397Z 2025-12-04T10:49:11.1685548Z [W1204 10:38:21.382411950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1685551Z 2025-12-04T10:49:11.1685697Z [W1204 10:38:21.382511549 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1685700Z 2025-12-04T10:49:11.1685851Z [W1204 10:38:21.383930858 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1685853Z 2025-12-04T10:49:11.1686002Z [W1204 10:38:21.384301865 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1686004Z 2025-12-04T10:49:11.1686151Z [W1204 10:38:21.384389354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1686153Z 2025-12-04T10:49:11.1686301Z [W1204 10:38:21.386665375 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1686313Z 2025-12-04T10:49:11.1686459Z [W1204 10:38:21.386936783 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1686474Z 2025-12-04T10:49:11.1686621Z [W1204 10:38:21.387022172 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1686623Z 2025-12-04T10:49:11.1686673Z ('RERUN', {'yellow': True}) [0.4852s] [100%] 2025-12-04T10:49:11.1687026Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:22.899494269 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687028Z 2025-12-04T10:49:11.1687177Z [W1204 10:38:22.899873696 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687180Z 2025-12-04T10:49:11.1687325Z [W1204 10:38:22.899960835 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687328Z 2025-12-04T10:49:11.1687476Z [W1204 10:38:22.901364594 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687478Z 2025-12-04T10:49:11.1687624Z [W1204 10:38:22.901699391 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687626Z 2025-12-04T10:49:11.1687772Z [W1204 10:38:22.901778160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687774Z 2025-12-04T10:49:11.1687930Z [W1204 10:38:22.904067471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1687933Z 2025-12-04T10:49:11.1688081Z [W1204 10:38:22.904328819 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1688084Z 2025-12-04T10:49:11.1688230Z [W1204 10:38:22.904406569 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1688232Z 2025-12-04T10:49:11.1688270Z FAILED [0.5135s] [100%] 2025-12-04T10:49:11.1688276Z 2025-12-04T10:49:11.1688326Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1688485Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1688530Z Traceback (most recent call last): 2025-12-04T10:49:11.1688689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1688729Z method(*args, **kwargs) 2025-12-04T10:49:11.1688881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1688921Z method(*args, **kwargs) 2025-12-04T10:49:11.1689073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1689109Z with policy(): 2025-12-04T10:49:11.1689260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1689300Z raise RuntimeError(msg) 2025-12-04T10:49:11.1689695Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1689708Z 2025-12-04T10:49:11.1689780Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1690067Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1690081Z 2025-12-04T10:49:11.1690167Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1690238Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1690295Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1690470Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1690543Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1690579Z graph_break [] 2025-12-04T10:49:11.1690650Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1690995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1691040Z if out == self.unknown_value: 2025-12-04T10:49:11.1691189Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1691235Z Traceback (most recent call last): 2025-12-04T10:49:11.1691385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1691425Z method(*args, **kwargs) 2025-12-04T10:49:11.1691586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1691627Z method(*args, **kwargs) 2025-12-04T10:49:11.1691774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1691813Z with policy(): 2025-12-04T10:49:11.1691996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1692036Z raise RuntimeError(msg) 2025-12-04T10:49:11.1692448Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1692450Z 2025-12-04T10:49:11.1692523Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1692810Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1692814Z 2025-12-04T10:49:11.1692900Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1692972Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1693027Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1693201Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1693272Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1693308Z graph_break [] 2025-12-04T10:49:11.1693379Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1693732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1693790Z if out == self.unknown_value: 2025-12-04T10:49:11.1693859Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1693914Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1693984Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1694158Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1694194Z graph_break [] 2025-12-04T10:49:11.1694246Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1694396Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1694441Z Traceback (most recent call last): 2025-12-04T10:49:11.1694596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1694635Z method(*args, **kwargs) 2025-12-04T10:49:11.1694786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1694825Z method(*args, **kwargs) 2025-12-04T10:49:11.1694975Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1695012Z with policy(): 2025-12-04T10:49:11.1695162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1695215Z raise RuntimeError(msg) 2025-12-04T10:49:11.1695612Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1695617Z 2025-12-04T10:49:11.1695688Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1695989Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1695992Z 2025-12-04T10:49:11.1696076Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1696149Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1696203Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1696379Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1696450Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1696487Z graph_break [] 2025-12-04T10:49:11.1696557Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1696903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1696946Z if out == self.unknown_value: 2025-12-04T10:49:11.1697016Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1697083Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1697153Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1697327Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1697377Z graph_break [] 2025-12-04T10:49:11.1697447Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1697499Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1697569Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1697743Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1697779Z graph_break [] 2025-12-04T10:49:11.1698021Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c02a83b16912c492.xml - 2025-12-04T10:49:11.1698082Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1698709Z FAILED [0.5135s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1698712Z 2025-12-04T10:49:11.1698783Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1699084Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1699088Z 2025-12-04T10:49:11.1699171Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1699234Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1699299Z ================== 1 failed, 57 deselected, 2 rerun in 11.64s ================== 2025-12-04T10:49:11.1699336Z Got exit code 1 2025-12-04T10:49:11.1699375Z Retrying single test... 2025-12-04T10:49:11.1699580Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ea881206b12128b7.xml 2025-12-04T10:49:11.1699636Z ============================= test session starts ============================== 2025-12-04T10:49:11.1699749Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1699790Z cachedir: .pytest_cache 2025-12-04T10:49:11.1699949Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1699994Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1700035Z configfile: pytest.ini 2025-12-04T10:49:11.1700195Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1700268Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1700552Z stepcurrent: skipping 44 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1700596Z Running 1 items in this shard 2025-12-04T10:49:11.1700598Z 2025-12-04T10:49:11.1700956Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:31.350955400 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1700977Z 2025-12-04T10:49:11.1701142Z [W1204 10:38:39.904532065 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701144Z 2025-12-04T10:49:11.1701296Z [W1204 10:38:39.904678464 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701297Z 2025-12-04T10:49:11.1701447Z [W1204 10:38:39.907844086 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701450Z 2025-12-04T10:49:11.1701595Z [W1204 10:38:39.908166124 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701598Z 2025-12-04T10:49:11.1701746Z [W1204 10:38:39.908246773 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701747Z 2025-12-04T10:49:11.1701930Z [W1204 10:38:39.910703732 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1701933Z 2025-12-04T10:49:11.1702080Z [W1204 10:38:39.910978020 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1702081Z 2025-12-04T10:49:11.1702228Z [W1204 10:38:39.911061799 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1702230Z 2025-12-04T10:49:11.1702280Z ('RERUN', {'yellow': True}) [10.2277s] [100%] 2025-12-04T10:49:11.1702659Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:40.080112681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1702663Z 2025-12-04T10:49:11.1702811Z [W1204 10:38:40.080538587 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1702815Z 2025-12-04T10:49:11.1702962Z [W1204 10:38:40.080637737 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1702964Z 2025-12-04T10:49:11.1703133Z [W1204 10:38:40.082052134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1703134Z 2025-12-04T10:49:11.1703284Z [W1204 10:38:40.082413531 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1703286Z 2025-12-04T10:49:11.1703435Z [W1204 10:38:40.082495471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1703437Z 2025-12-04T10:49:11.1703584Z [W1204 10:38:40.084819821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1703587Z 2025-12-04T10:49:11.1703733Z [W1204 10:38:40.085087698 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1703735Z 2025-12-04T10:49:11.1703882Z [W1204 10:38:40.085172118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1703884Z 2025-12-04T10:49:11.1703933Z ('RERUN', {'yellow': True}) [0.6676s] [100%] 2025-12-04T10:49:11.1704288Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:38:41.763334337 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1704312Z 2025-12-04T10:49:11.1704460Z [W1204 10:38:41.763749523 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1704475Z 2025-12-04T10:49:11.1704622Z [W1204 10:38:41.763841752 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1704624Z 2025-12-04T10:49:11.1704771Z [W1204 10:38:41.765263950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1704773Z 2025-12-04T10:49:11.1704919Z [W1204 10:38:41.765603017 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1704921Z 2025-12-04T10:49:11.1705068Z [W1204 10:38:41.765682347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1705072Z 2025-12-04T10:49:11.1705218Z [W1204 10:38:41.768006477 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1705220Z 2025-12-04T10:49:11.1705368Z [W1204 10:38:41.768270744 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1705370Z 2025-12-04T10:49:11.1705518Z [W1204 10:38:41.768348284 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1705520Z 2025-12-04T10:49:11.1705559Z FAILED [0.6662s] [100%] 2025-12-04T10:49:11.1705561Z 2025-12-04T10:49:11.1705612Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1705772Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1705819Z Traceback (most recent call last): 2025-12-04T10:49:11.1705976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1706017Z method(*args, **kwargs) 2025-12-04T10:49:11.1706168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1706207Z method(*args, **kwargs) 2025-12-04T10:49:11.1706367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1706404Z with policy(): 2025-12-04T10:49:11.1706555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1706595Z raise RuntimeError(msg) 2025-12-04T10:49:11.1706988Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1706993Z 2025-12-04T10:49:11.1707066Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1707353Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1707356Z 2025-12-04T10:49:11.1707442Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1707513Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1707569Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1707760Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1707833Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1707882Z graph_break [] 2025-12-04T10:49:11.1707952Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1708295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1708338Z if out == self.unknown_value: 2025-12-04T10:49:11.1708486Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1708530Z Traceback (most recent call last): 2025-12-04T10:49:11.1708684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1708724Z method(*args, **kwargs) 2025-12-04T10:49:11.1708874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1708913Z method(*args, **kwargs) 2025-12-04T10:49:11.1709064Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1709100Z with policy(): 2025-12-04T10:49:11.1709251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1709291Z raise RuntimeError(msg) 2025-12-04T10:49:11.1709709Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1709712Z 2025-12-04T10:49:11.1709785Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1710073Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1710075Z 2025-12-04T10:49:11.1710160Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1710240Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1710296Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1710471Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1710545Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1710580Z graph_break [] 2025-12-04T10:49:11.1710651Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1710992Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1711036Z if out == self.unknown_value: 2025-12-04T10:49:11.1711107Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1711161Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1711231Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1711406Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1711452Z graph_break [] 2025-12-04T10:49:11.1711502Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1711663Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1711707Z Traceback (most recent call last): 2025-12-04T10:49:11.1711892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1711931Z method(*args, **kwargs) 2025-12-04T10:49:11.1712083Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1712122Z method(*args, **kwargs) 2025-12-04T10:49:11.1712273Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1712310Z with policy(): 2025-12-04T10:49:11.1712461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1712501Z raise RuntimeError(msg) 2025-12-04T10:49:11.1712900Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1712902Z 2025-12-04T10:49:11.1712975Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1713275Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1713278Z 2025-12-04T10:49:11.1713363Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1713434Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1713490Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1713662Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1713732Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1713767Z graph_break [] 2025-12-04T10:49:11.1713851Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1714194Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1714237Z if out == self.unknown_value: 2025-12-04T10:49:11.1714307Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1714363Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1714432Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1714605Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1714641Z graph_break [] 2025-12-04T10:49:11.1714711Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1714764Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1714834Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1715007Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1715056Z graph_break [] 2025-12-04T10:49:11.1715296Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ea881206b12128b7.xml - 2025-12-04T10:49:11.1715376Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1715999Z FAILED [0.6662s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1716003Z 2025-12-04T10:49:11.1716074Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1716360Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1716363Z 2025-12-04T10:49:11.1716447Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1716508Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1716575Z ================== 1 failed, 57 deselected, 2 rerun in 11.74s ================== 2025-12-04T10:49:11.1716611Z Got exit code 1 2025-12-04T10:49:11.1716850Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1716988Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1717185Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-e0035f49889a92f7.xml 2025-12-04T10:49:11.1717242Z ============================= test session starts ============================== 2025-12-04T10:49:11.1717353Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1717393Z cachedir: .pytest_cache 2025-12-04T10:49:11.1717565Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1717611Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1717651Z configfile: pytest.ini 2025-12-04T10:49:11.1717812Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1717888Z collecting ... collected 58 items / 45 deselected / 13 selected 2025-12-04T10:49:11.1717939Z stepcurrent: skipping 45 already run items. 2025-12-04T10:49:11.1717983Z Running 13 items in this shard 2025-12-04T10:49:11.1717986Z 2025-12-04T10:49:11.1718232Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.1682s] [ 7%] 2025-12-04T10:49:11.1718476Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6465s] [ 7%] 2025-12-04T10:49:11.1718698Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.6777s] [ 7%] 2025-12-04T10:49:11.1718700Z 2025-12-04T10:49:11.1718751Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1718912Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1718956Z Traceback (most recent call last): 2025-12-04T10:49:11.1719123Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1719162Z method(*args, **kwargs) 2025-12-04T10:49:11.1719313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1719352Z method(*args, **kwargs) 2025-12-04T10:49:11.1719502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1719538Z with policy(): 2025-12-04T10:49:11.1719692Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1719733Z raise RuntimeError(msg) 2025-12-04T10:49:11.1720124Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1720128Z 2025-12-04T10:49:11.1720200Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1720488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1720490Z 2025-12-04T10:49:11.1720575Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1720661Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1720717Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1720990Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1721063Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1721099Z graph_break [] 2025-12-04T10:49:11.1721247Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1721303Z Traceback (most recent call last): 2025-12-04T10:49:11.1721456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1721495Z method(*args, **kwargs) 2025-12-04T10:49:11.1721646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1721686Z method(*args, **kwargs) 2025-12-04T10:49:11.1721835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1721940Z with policy(): 2025-12-04T10:49:11.1722815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1723006Z raise RuntimeError(msg) 2025-12-04T10:49:11.1723470Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1723475Z 2025-12-04T10:49:11.1723566Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1724230Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1724309Z 2025-12-04T10:49:11.1724404Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1724500Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1724560Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1724843Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1724922Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1724963Z graph_break [] 2025-12-04T10:49:11.1725044Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1725101Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1725175Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1725449Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1725492Z graph_break [] 2025-12-04T10:49:11.1725548Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1725709Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1725760Z Traceback (most recent call last): 2025-12-04T10:49:11.1725990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1726036Z method(*args, **kwargs) 2025-12-04T10:49:11.1726192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1726237Z method(*args, **kwargs) 2025-12-04T10:49:11.1726390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1726430Z with policy(): 2025-12-04T10:49:11.1726585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1726701Z raise RuntimeError(msg) 2025-12-04T10:49:11.1727113Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1727118Z 2025-12-04T10:49:11.1727198Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1727489Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1727491Z 2025-12-04T10:49:11.1727584Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1727660Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1727720Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1727994Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1728084Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1728123Z graph_break [] 2025-12-04T10:49:11.1728200Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1728273Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1728348Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1728626Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1728664Z graph_break [] 2025-12-04T10:49:11.1728739Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1728794Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1728872Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1729141Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1729184Z graph_break [] 2025-12-04T10:49:11.1729434Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-e0035f49889a92f7.xml - 2025-12-04T10:49:11.1729499Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1730171Z FAILED [0.6777s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1730180Z 2025-12-04T10:49:11.1730254Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1730545Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1730548Z 2025-12-04T10:49:11.1730648Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1730715Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1730786Z ================== 1 failed, 45 deselected, 2 rerun in 4.68s =================== 2025-12-04T10:49:11.1730829Z Got exit code 1 2025-12-04T10:49:11.1730874Z Retrying single test... 2025-12-04T10:49:11.1731078Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-daabaf0caa5254a1.xml 2025-12-04T10:49:11.1731139Z ============================= test session starts ============================== 2025-12-04T10:49:11.1731261Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1731304Z cachedir: .pytest_cache 2025-12-04T10:49:11.1731470Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1731522Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1731569Z configfile: pytest.ini 2025-12-04T10:49:11.1731737Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1731817Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1732175Z stepcurrent: skipping 45 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1732243Z Running 1 items in this shard 2025-12-04T10:49:11.1732246Z 2025-12-04T10:49:11.1732613Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:04.732949923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1732617Z 2025-12-04T10:49:11.1732772Z [W1204 10:39:11.444654433 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1732774Z 2025-12-04T10:49:11.1732931Z [W1204 10:39:11.444841831 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1732934Z 2025-12-04T10:49:11.1733085Z [W1204 10:39:11.448221841 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1733092Z 2025-12-04T10:49:11.1733241Z [W1204 10:39:11.448561568 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1733243Z 2025-12-04T10:49:11.1733397Z [W1204 10:39:11.448636187 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1733399Z 2025-12-04T10:49:11.1733549Z [W1204 10:39:11.451175254 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1733551Z 2025-12-04T10:49:11.1733718Z [W1204 10:39:11.451442482 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1733721Z 2025-12-04T10:49:11.1733870Z [W1204 10:39:11.451517401 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1733877Z 2025-12-04T10:49:11.1733931Z ('RERUN', {'yellow': True}) [10.8040s] [100%] 2025-12-04T10:49:11.1734293Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:12.141323692 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1734295Z 2025-12-04T10:49:11.1734462Z [W1204 10:39:12.141718978 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1734464Z 2025-12-04T10:49:11.1734620Z [W1204 10:39:12.141814337 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1734623Z 2025-12-04T10:49:11.1734773Z [W1204 10:39:12.143356963 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1734776Z 2025-12-04T10:49:11.1734929Z [W1204 10:39:12.143666030 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1734931Z 2025-12-04T10:49:11.1735084Z [W1204 10:39:12.143746950 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1735087Z 2025-12-04T10:49:11.1735240Z [W1204 10:39:12.145987119 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1735242Z 2025-12-04T10:49:11.1735400Z [W1204 10:39:12.146260437 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1735414Z 2025-12-04T10:49:11.1735565Z [W1204 10:39:12.146337136 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1735580Z 2025-12-04T10:49:11.1735637Z ('RERUN', {'yellow': True}) [0.5420s] [100%] 2025-12-04T10:49:11.1735999Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:13.662272822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736001Z 2025-12-04T10:49:11.1736152Z [W1204 10:39:13.662657018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736154Z 2025-12-04T10:49:11.1736309Z [W1204 10:39:13.662748037 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736312Z 2025-12-04T10:49:11.1736461Z [W1204 10:39:13.664142615 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736464Z 2025-12-04T10:49:11.1736619Z [W1204 10:39:13.664396152 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736621Z 2025-12-04T10:49:11.1736773Z [W1204 10:39:13.664490551 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1736775Z 2025-12-04T10:49:11.1736926Z [W1204 10:39:13.666650632 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1736928Z 2025-12-04T10:49:11.1737080Z [W1204 10:39:13.666905769 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1737112Z 2025-12-04T10:49:11.1737262Z [W1204 10:39:13.666980939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1737264Z 2025-12-04T10:49:11.1737311Z FAILED [0.5302s] [100%] 2025-12-04T10:49:11.1737313Z 2025-12-04T10:49:11.1737369Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1737528Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1737578Z Traceback (most recent call last): 2025-12-04T10:49:11.1737754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1737799Z method(*args, **kwargs) 2025-12-04T10:49:11.1737957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1738005Z method(*args, **kwargs) 2025-12-04T10:49:11.1738159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1738203Z with policy(): 2025-12-04T10:49:11.1738359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1738406Z raise RuntimeError(msg) 2025-12-04T10:49:11.1738801Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1738804Z 2025-12-04T10:49:11.1738884Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1739175Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1739193Z 2025-12-04T10:49:11.1739286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1739376Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1739438Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1739717Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1739793Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1739837Z graph_break [] 2025-12-04T10:49:11.1739912Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1740266Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1740313Z if out == self.unknown_value: 2025-12-04T10:49:11.1740470Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1740518Z Traceback (most recent call last): 2025-12-04T10:49:11.1740677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1740721Z method(*args, **kwargs) 2025-12-04T10:49:11.1740877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1740922Z method(*args, **kwargs) 2025-12-04T10:49:11.1741088Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1741129Z with policy(): 2025-12-04T10:49:11.1741287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1741331Z raise RuntimeError(msg) 2025-12-04T10:49:11.1741747Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1741751Z 2025-12-04T10:49:11.1741830Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1742161Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1742164Z 2025-12-04T10:49:11.1742256Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1742333Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1742395Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1742667Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1742747Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1742786Z graph_break [] 2025-12-04T10:49:11.1742864Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1743207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1743279Z if out == self.unknown_value: 2025-12-04T10:49:11.1743372Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1743429Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1743506Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1743777Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1743819Z graph_break [] 2025-12-04T10:49:11.1743874Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1744031Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1744080Z Traceback (most recent call last): 2025-12-04T10:49:11.1744240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1744283Z method(*args, **kwargs) 2025-12-04T10:49:11.1744440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1744482Z method(*args, **kwargs) 2025-12-04T10:49:11.1744640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1744679Z with policy(): 2025-12-04T10:49:11.1744837Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1744880Z raise RuntimeError(msg) 2025-12-04T10:49:11.1745300Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1745304Z 2025-12-04T10:49:11.1745383Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1745686Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1745688Z 2025-12-04T10:49:11.1745782Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1745856Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1745917Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1746190Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1746268Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1746307Z graph_break [] 2025-12-04T10:49:11.1746384Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1746728Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1746778Z if out == self.unknown_value: 2025-12-04T10:49:11.1746854Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1746922Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1746999Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1747267Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1747324Z graph_break [] 2025-12-04T10:49:11.1747397Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1747457Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1747529Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1747802Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1747841Z graph_break [] 2025-12-04T10:49:11.1748091Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-daabaf0caa5254a1.xml - 2025-12-04T10:49:11.1748154Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1748784Z FAILED [0.5302s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1748786Z 2025-12-04T10:49:11.1748874Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1749163Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1749167Z 2025-12-04T10:49:11.1749258Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1749321Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1749394Z ================== 1 failed, 57 deselected, 2 rerun in 12.04s ================== 2025-12-04T10:49:11.1749433Z Got exit code 1 2025-12-04T10:49:11.1749496Z Retrying single test... 2025-12-04T10:49:11.1749696Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-55f495db05f11a1a.xml 2025-12-04T10:49:11.1749761Z ============================= test session starts ============================== 2025-12-04T10:49:11.1749878Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1749926Z cachedir: .pytest_cache 2025-12-04T10:49:11.1750087Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1750141Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1750185Z configfile: pytest.ini 2025-12-04T10:49:11.1750354Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1750433Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1750718Z stepcurrent: skipping 45 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1750770Z Running 1 items in this shard 2025-12-04T10:49:11.1750789Z 2025-12-04T10:49:11.1751146Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:23.911353729 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751174Z 2025-12-04T10:49:11.1751332Z [W1204 10:39:30.558627097 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751335Z 2025-12-04T10:49:11.1751488Z [W1204 10:39:30.558770176 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751494Z 2025-12-04T10:49:11.1751646Z [W1204 10:39:30.561922087 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751648Z 2025-12-04T10:49:11.1751804Z [W1204 10:39:30.562241234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751806Z 2025-12-04T10:49:11.1751995Z [W1204 10:39:30.562322073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1751998Z 2025-12-04T10:49:11.1752152Z [W1204 10:39:30.564841469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1752154Z 2025-12-04T10:49:11.1752305Z [W1204 10:39:30.565126157 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1752307Z 2025-12-04T10:49:11.1752460Z [W1204 10:39:30.565206956 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1752462Z 2025-12-04T10:49:11.1752538Z ('RERUN', {'yellow': True}) [9.5775s] [100%] 2025-12-04T10:49:11.1752896Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:30.145857551 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1752900Z 2025-12-04T10:49:11.1753054Z [W1204 10:39:30.146248138 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753056Z 2025-12-04T10:49:11.1753219Z [W1204 10:39:30.146351407 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753220Z 2025-12-04T10:49:11.1753374Z [W1204 10:39:30.147749944 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753376Z 2025-12-04T10:49:11.1753529Z [W1204 10:39:30.148034281 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753532Z 2025-12-04T10:49:11.1753681Z [W1204 10:39:30.148120710 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1753684Z 2025-12-04T10:49:11.1753837Z [W1204 10:39:30.150284410 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753839Z 2025-12-04T10:49:11.1753990Z [W1204 10:39:30.150567267 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1753991Z 2025-12-04T10:49:11.1754150Z [W1204 10:39:30.150648247 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1754152Z 2025-12-04T10:49:11.1754203Z ('RERUN', {'yellow': True}) [0.4603s] [100%] 2025-12-04T10:49:11.1754582Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:39:31.625229681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1754600Z 2025-12-04T10:49:11.1754754Z [W1204 10:39:31.625613098 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1754756Z 2025-12-04T10:49:11.1754906Z [W1204 10:39:31.625711457 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1754908Z 2025-12-04T10:49:11.1755062Z [W1204 10:39:31.627109104 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1755063Z 2025-12-04T10:49:11.1755214Z [W1204 10:39:31.627388291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1755221Z 2025-12-04T10:49:11.1755370Z [W1204 10:39:31.627471050 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1755374Z 2025-12-04T10:49:11.1755528Z [W1204 10:39:31.629668240 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1755531Z 2025-12-04T10:49:11.1755680Z [W1204 10:39:31.629940117 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1755683Z 2025-12-04T10:49:11.1755836Z [W1204 10:39:31.630027686 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1755837Z 2025-12-04T10:49:11.1755878Z FAILED [0.4736s] [100%] 2025-12-04T10:49:11.1755880Z 2025-12-04T10:49:11.1755952Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1756104Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1756159Z Traceback (most recent call last): 2025-12-04T10:49:11.1756323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1756367Z method(*args, **kwargs) 2025-12-04T10:49:11.1756525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1756610Z method(*args, **kwargs) 2025-12-04T10:49:11.1756768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1756809Z with policy(): 2025-12-04T10:49:11.1756970Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1757014Z raise RuntimeError(msg) 2025-12-04T10:49:11.1757412Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1757416Z 2025-12-04T10:49:11.1757492Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1757786Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1757788Z 2025-12-04T10:49:11.1757877Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1757969Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1758031Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1758304Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1758399Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1758438Z graph_break [] 2025-12-04T10:49:11.1758517Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1758865Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1758917Z if out == self.unknown_value: 2025-12-04T10:49:11.1759072Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1759124Z Traceback (most recent call last): 2025-12-04T10:49:11.1759279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1759326Z method(*args, **kwargs) 2025-12-04T10:49:11.1759479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1759528Z method(*args, **kwargs) 2025-12-04T10:49:11.1759682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1759726Z with policy(): 2025-12-04T10:49:11.1759882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1759944Z raise RuntimeError(msg) 2025-12-04T10:49:11.1760452Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1760458Z 2025-12-04T10:49:11.1760533Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1760841Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1760843Z 2025-12-04T10:49:11.1760933Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1761013Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1761072Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1761350Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1761427Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1761470Z graph_break [] 2025-12-04T10:49:11.1761544Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1762122Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1762173Z if out == self.unknown_value: 2025-12-04T10:49:11.1762248Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1762332Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1762406Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1762725Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1762764Z graph_break [] 2025-12-04T10:49:11.1762824Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1762976Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1763028Z Traceback (most recent call last): 2025-12-04T10:49:11.1763187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1763234Z method(*args, **kwargs) 2025-12-04T10:49:11.1763385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1763433Z method(*args, **kwargs) 2025-12-04T10:49:11.1763585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1763629Z with policy(): 2025-12-04T10:49:11.1763783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1763830Z raise RuntimeError(msg) 2025-12-04T10:49:11.1764246Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1764253Z 2025-12-04T10:49:11.1764329Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1764619Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1764623Z 2025-12-04T10:49:11.1764711Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1764788Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1764859Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1765136Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1765211Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1765255Z graph_break [] 2025-12-04T10:49:11.1765329Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1765676Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1765726Z if out == self.unknown_value: 2025-12-04T10:49:11.1765798Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1765860Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1765932Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1766206Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1766266Z graph_break [] 2025-12-04T10:49:11.1766342Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1766413Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1766489Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1766759Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1766801Z graph_break [] 2025-12-04T10:49:11.1767050Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-55f495db05f11a1a.xml - 2025-12-04T10:49:11.1767119Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1767748Z FAILED [0.4736s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1767752Z 2025-12-04T10:49:11.1767828Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1768118Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1768136Z 2025-12-04T10:49:11.1768224Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1768292Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1768361Z ================== 1 failed, 57 deselected, 2 rerun in 10.69s ================== 2025-12-04T10:49:11.1768425Z Got exit code 1 2025-12-04T10:49:11.1768665Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1768816Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1769017Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d345b5674bb470d0.xml 2025-12-04T10:49:11.1769076Z ============================= test session starts ============================== 2025-12-04T10:49:11.1769197Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1769241Z cachedir: .pytest_cache 2025-12-04T10:49:11.1769406Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1769455Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1769501Z configfile: pytest.ini 2025-12-04T10:49:11.1769666Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1769748Z collecting ... collected 58 items / 46 deselected / 12 selected 2025-12-04T10:49:11.1769803Z stepcurrent: skipping 46 already run items. 2025-12-04T10:49:11.1769853Z Running 12 items in this shard 2025-12-04T10:49:11.1769856Z 2025-12-04T10:49:11.1770109Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6525s] [ 8%] 2025-12-04T10:49:11.1770372Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5290s] [ 8%] 2025-12-04T10:49:11.1770608Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.5278s] [ 8%] 2025-12-04T10:49:11.1770614Z 2025-12-04T10:49:11.1770668Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1770826Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1770874Z Traceback (most recent call last): 2025-12-04T10:49:11.1771040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1771084Z method(*args, **kwargs) 2025-12-04T10:49:11.1771241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1771284Z method(*args, **kwargs) 2025-12-04T10:49:11.1771440Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1771480Z with policy(): 2025-12-04T10:49:11.1771638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1771683Z raise RuntimeError(msg) 2025-12-04T10:49:11.1772155Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1772158Z 2025-12-04T10:49:11.1772234Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1772541Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1772544Z 2025-12-04T10:49:11.1772635Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1772710Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1772785Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1772964Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1773043Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1773083Z graph_break [] 2025-12-04T10:49:11.1773244Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1773292Z Traceback (most recent call last): 
2025-12-04T10:49:11.1773450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1773492Z method(*args, **kwargs) 2025-12-04T10:49:11.1773647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1773690Z method(*args, **kwargs) 2025-12-04T10:49:11.1773845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1773885Z with policy(): 2025-12-04T10:49:11.1774044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1774104Z raise RuntimeError(msg) 2025-12-04T10:49:11.1774513Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1774533Z 2025-12-04T10:49:11.1774612Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1774903Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1774906Z 2025-12-04T10:49:11.1774997Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1775073Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1775133Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1775308Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1775387Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1775426Z graph_break [] 2025-12-04T10:49:11.1775503Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1775559Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1775636Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1775810Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1775868Z graph_break [] 2025-12-04T10:49:11.1775923Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1776078Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1776131Z Traceback (most recent call last): 2025-12-04T10:49:11.1776288Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1776334Z method(*args, **kwargs) 2025-12-04T10:49:11.1776504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1776550Z method(*args, **kwargs) 2025-12-04T10:49:11.1776701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1776744Z with policy(): 2025-12-04T10:49:11.1776898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1776946Z raise RuntimeError(msg) 2025-12-04T10:49:11.1777350Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1777353Z 2025-12-04T10:49:11.1777432Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1777720Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1777727Z 2025-12-04T10:49:11.1777815Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1777904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1777963Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1778155Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1778228Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1778272Z graph_break [] 2025-12-04T10:49:11.1778344Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1778405Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1778477Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1778656Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1778696Z graph_break [] 2025-12-04T10:49:11.1778773Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1778829Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1778906Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1779081Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1779124Z graph_break [] 2025-12-04T10:49:11.1779368Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d345b5674bb470d0.xml - 2025-12-04T10:49:11.1779434Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1780085Z FAILED [0.5278s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1780091Z 2025-12-04T10:49:11.1780165Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1780473Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1780476Z 2025-12-04T10:49:11.1780562Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1780630Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1780699Z ================== 1 failed, 46 deselected, 2 rerun in 3.87s =================== 2025-12-04T10:49:11.1780743Z Got exit code 1 2025-12-04T10:49:11.1780786Z Retrying single test... 
2025-12-04T10:49:11.1780989Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3812d43f994aee70.xml 2025-12-04T10:49:11.1781048Z ============================= test session starts ============================== 2025-12-04T10:49:11.1781167Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1781211Z cachedir: .pytest_cache 2025-12-04T10:49:11.1781375Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1781427Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1781470Z configfile: pytest.ini 2025-12-04T10:49:11.1781652Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1781728Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1782066Z stepcurrent: skipping 46 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1782113Z Running 1 items in this shard 2025-12-04T10:49:11.1782115Z 2025-12-04T10:49:11.1782485Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:39:51.024156370 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1782487Z 2025-12-04T10:49:11.1782642Z [W1204 10:39:58.378335826 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1782646Z 2025-12-04T10:49:11.1782802Z [W1204 10:39:58.378517414 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1782805Z 2025-12-04T10:49:11.1782959Z [W1204 10:39:58.381904981 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1782961Z 2025-12-04T10:49:11.1783110Z [W1204 10:39:58.383174789 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1783112Z 2025-12-04T10:49:11.1783266Z [W1204 10:39:58.383265208 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1783270Z 2025-12-04T10:49:11.1783447Z [W1204 10:39:58.385823573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1783450Z 2025-12-04T10:49:11.1783605Z [W1204 10:39:58.386123780 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1783608Z 2025-12-04T10:49:11.1783762Z [W1204 10:39:58.386205669 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1783763Z 2025-12-04T10:49:11.1783815Z ('RERUN', {'yellow': True}) [9.9764s] [100%] 2025-12-04T10:49:11.1784193Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:39:59.427019630 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1784195Z 2025-12-04T10:49:11.1784346Z [W1204 10:39:59.427440426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1784348Z 2025-12-04T10:49:11.1784501Z [W1204 10:39:59.427528455 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1784504Z 2025-12-04T10:49:11.1784657Z [W1204 10:39:59.428909222 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1784659Z 2025-12-04T10:49:11.1784808Z [W1204 10:39:59.429247169 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1784811Z 2025-12-04T10:49:11.1784964Z [W1204 10:39:59.429331408 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1784966Z 2025-12-04T10:49:11.1785116Z [W1204 10:39:59.431546846 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1785138Z 2025-12-04T10:49:11.1785290Z [W1204 10:39:59.431810263 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1785307Z 2025-12-04T10:49:11.1785455Z [W1204 10:39:59.431893533 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1785461Z 2025-12-04T10:49:11.1785512Z ('RERUN', {'yellow': True}) [0.4978s] [100%] 2025-12-04T10:49:11.1785875Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:40:00.909340711 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1785878Z 2025-12-04T10:49:11.1786027Z [W1204 10:40:00.909722047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786030Z 2025-12-04T10:49:11.1786182Z [W1204 10:40:00.909808026 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786187Z 2025-12-04T10:49:11.1786335Z [W1204 10:40:00.911190933 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786341Z 2025-12-04T10:49:11.1786490Z [W1204 10:40:00.911517310 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786492Z 2025-12-04T10:49:11.1786645Z [W1204 10:40:00.911597559 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786646Z 2025-12-04T10:49:11.1786811Z [W1204 10:40:00.913797027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1786886Z 2025-12-04T10:49:11.1787087Z [W1204 10:40:00.914061135 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1787090Z 2025-12-04T10:49:11.1787239Z [W1204 10:40:00.914141524 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1787241Z 2025-12-04T10:49:11.1787286Z FAILED [0.4723s] [100%] 2025-12-04T10:49:11.1787289Z 2025-12-04T10:49:11.1787345Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1787516Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1787571Z Traceback (most recent call last): 2025-12-04T10:49:11.1787729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1787778Z method(*args, **kwargs) 2025-12-04T10:49:11.1787933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1787979Z method(*args, **kwargs) 2025-12-04T10:49:11.1788135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1788179Z with policy(): 2025-12-04T10:49:11.1788332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1788380Z raise RuntimeError(msg) 2025-12-04T10:49:11.1788780Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1788798Z 2025-12-04T10:49:11.1788878Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1789167Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1789191Z 2025-12-04T10:49:11.1789280Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1789359Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1789418Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1789598Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1789673Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1789717Z graph_break [] 2025-12-04T10:49:11.1789791Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1790141Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1790192Z if out == self.unknown_value: 2025-12-04T10:49:11.1790350Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1790398Z Traceback (most recent call last): 2025-12-04T10:49:11.1790558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1790600Z method(*args, **kwargs) 2025-12-04T10:49:11.1790772Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1790815Z method(*args, **kwargs) 2025-12-04T10:49:11.1790971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1791012Z with policy(): 2025-12-04T10:49:11.1791169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1791213Z raise RuntimeError(msg) 2025-12-04T10:49:11.1791635Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1791638Z 2025-12-04T10:49:11.1791717Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1793022Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1793123Z 2025-12-04T10:49:11.1793601Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1793849Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1794022Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1794591Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1794810Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1794919Z graph_break [] 2025-12-04T10:49:11.1795138Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1796681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1796896Z if out == self.unknown_value: 2025-12-04T10:49:11.1797101Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1797258Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1797462Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1797956Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1798064Z graph_break [] 2025-12-04T10:49:11.1798210Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1798659Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1798788Z Traceback (most recent call last): 2025-12-04T10:49:11.1799248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1799372Z method(*args, **kwargs) 2025-12-04T10:49:11.1799801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1799907Z method(*args, **kwargs) 2025-12-04T10:49:11.1800324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1800426Z with policy(): 2025-12-04T10:49:11.1800848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1800961Z raise RuntimeError(msg) 2025-12-04T10:49:11.1802268Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1802284Z 2025-12-04T10:49:11.1802500Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1803392Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1803398Z 2025-12-04T10:49:11.1803651Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1803854Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1803930Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1804184Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1804291Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1804343Z graph_break [] 2025-12-04T10:49:11.1804445Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1804921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1804985Z if out == self.unknown_value: 2025-12-04T10:49:11.1805083Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1805168Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1805300Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1805546Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1805626Z graph_break [] 2025-12-04T10:49:11.1805724Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1805800Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1805897Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1806142Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1806193Z graph_break [] 2025-12-04T10:49:11.1806529Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3812d43f994aee70.xml - 2025-12-04T10:49:11.1806615Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1807529Z FAILED [0.4723s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1807533Z 2025-12-04T10:49:11.1807638Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1808053Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1808058Z 2025-12-04T10:49:11.1808184Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1808266Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1808365Z ================== 1 failed, 57 deselected, 2 rerun in 11.09s ================== 2025-12-04T10:49:11.1808417Z Got exit code 1 2025-12-04T10:49:11.1808476Z Retrying single test... 2025-12-04T10:49:11.1808771Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cb510df96182b631.xml 2025-12-04T10:49:11.1808859Z ============================= test session starts ============================== 2025-12-04T10:49:11.1809018Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1809079Z cachedir: .pytest_cache 2025-12-04T10:49:11.1809305Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1809376Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1809432Z configfile: pytest.ini 2025-12-04T10:49:11.1809672Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1809777Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1810176Z stepcurrent: skipping 46 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1810246Z Running 1 items in this shard 2025-12-04T10:49:11.1810248Z 2025-12-04T10:49:11.1810755Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:40:09.347299491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1810771Z 2025-12-04T10:49:11.1811006Z [W1204 10:40:17.002096185 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1811009Z 2025-12-04T10:49:11.1811216Z [W1204 10:40:17.002283563 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1811218Z 2025-12-04T10:49:11.1811426Z [W1204 10:40:17.006489001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1811428Z 2025-12-04T10:49:11.1811636Z [W1204 10:40:17.006976246 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1811639Z 2025-12-04T10:49:11.1811843Z [W1204 10:40:17.007069255 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1811884Z 2025-12-04T10:49:11.1812093Z [W1204 10:40:17.009700798 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1812096Z 2025-12-04T10:49:11.1812300Z [W1204 10:40:17.010049865 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1812302Z 2025-12-04T10:49:11.1812513Z [W1204 10:40:17.010132714 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1812515Z 2025-12-04T10:49:11.1812594Z ('RERUN', {'yellow': True}) [10.3914s] [100%] 2025-12-04T10:49:11.1813118Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:40:18.189477836 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1813121Z 2025-12-04T10:49:11.1813329Z [W1204 10:40:18.189896172 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1813333Z 2025-12-04T10:49:11.1813535Z [W1204 10:40:18.190011211 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1813539Z 2025-12-04T10:49:11.1813766Z [W1204 10:40:18.191429757 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1813770Z 2025-12-04T10:49:11.1813972Z [W1204 10:40:18.191763343 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1813979Z 2025-12-04T10:49:11.1814184Z [W1204 10:40:18.191842883 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1814187Z 2025-12-04T10:49:11.1814395Z [W1204 10:40:18.194177709 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1814398Z 2025-12-04T10:49:11.1814597Z [W1204 10:40:18.194455306 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1814599Z 2025-12-04T10:49:11.1814800Z [W1204 10:40:18.194532576 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1814802Z 2025-12-04T10:49:11.1814862Z ('RERUN', {'yellow': True}) [0.6910s] [100%] 2025-12-04T10:49:11.1815261Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:40:19.917347965 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1815277Z 2025-12-04T10:49:11.1815443Z [W1204 10:40:19.917767771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1815477Z 2025-12-04T10:49:11.1815639Z [W1204 10:40:19.917858430 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1815641Z 2025-12-04T10:49:11.1815805Z [W1204 10:40:19.919301826 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1815807Z 2025-12-04T10:49:11.1815968Z [W1204 10:40:19.919642872 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1815970Z 2025-12-04T10:49:11.1816133Z [W1204 10:40:19.919721862 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1816136Z 2025-12-04T10:49:11.1816300Z [W1204 10:40:19.922074368 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1816302Z 2025-12-04T10:49:11.1816464Z [W1204 10:40:19.922339535 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1816466Z 2025-12-04T10:49:11.1816629Z [W1204 10:40:19.922422125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1816632Z 2025-12-04T10:49:11.1816677Z FAILED [0.7138s] [100%] 2025-12-04T10:49:11.1816679Z 2025-12-04T10:49:11.1816740Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1816917Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1816973Z Traceback (most recent call last): 2025-12-04T10:49:11.1817146Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1817196Z method(*args, **kwargs) 2025-12-04T10:49:11.1817362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1817409Z method(*args, **kwargs) 2025-12-04T10:49:11.1817573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1817629Z with policy(): 2025-12-04T10:49:11.1817799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1817843Z raise RuntimeError(msg) 2025-12-04T10:49:11.1818281Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1818285Z 2025-12-04T10:49:11.1818366Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1818687Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1818691Z 2025-12-04T10:49:11.1818785Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1818867Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1818929Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1819139Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1819222Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1819276Z graph_break [] 2025-12-04T10:49:11.1819358Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1819732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1819784Z if out == self.unknown_value: 2025-12-04T10:49:11.1819950Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1820002Z Traceback (most recent call last): 2025-12-04T10:49:11.1820173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1820221Z method(*args, **kwargs) 2025-12-04T10:49:11.1820385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1820433Z method(*args, **kwargs) 2025-12-04T10:49:11.1820598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1820642Z with policy(): 2025-12-04T10:49:11.1820808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1820856Z raise RuntimeError(msg) 2025-12-04T10:49:11.1821313Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1821321Z 2025-12-04T10:49:11.1821400Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1821720Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1821722Z 2025-12-04T10:49:11.1821815Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1821960Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1822021Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1822215Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1822296Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1822341Z graph_break [] 2025-12-04T10:49:11.1822419Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1822796Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1822844Z if out == self.unknown_value: 2025-12-04T10:49:11.1822926Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1822986Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1823067Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1823257Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1823318Z graph_break [] 2025-12-04T10:49:11.1823378Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1823558Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1823611Z Traceback (most recent call last): 2025-12-04T10:49:11.1823778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1823852Z method(*args, **kwargs) 2025-12-04T10:49:11.1824016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1824062Z method(*args, **kwargs) 2025-12-04T10:49:11.1824226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1824271Z with policy(): 2025-12-04T10:49:11.1824435Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1824482Z raise RuntimeError(msg) 2025-12-04T10:49:11.1824894Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1824897Z 2025-12-04T10:49:11.1824973Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1825274Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1825281Z 2025-12-04T10:49:11.1825367Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1825441Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1825500Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1825677Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1825750Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1825789Z graph_break [] 2025-12-04T10:49:11.1825872Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1826218Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1826264Z if out == self.unknown_value: 2025-12-04T10:49:11.1826337Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1826393Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1826466Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1826640Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1826679Z graph_break [] 2025-12-04T10:49:11.1826750Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1826807Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1826877Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1827052Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1827102Z graph_break [] 2025-12-04T10:49:11.1827344Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cb510df96182b631.xml - 2025-12-04T10:49:11.1827417Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1828053Z FAILED [0.7138s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1828057Z 2025-12-04T10:49:11.1828131Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1828422Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1828428Z 2025-12-04T10:49:11.1828512Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1828576Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1828643Z ================== 1 failed, 57 deselected, 2 rerun in 11.96s ================== 2025-12-04T10:49:11.1828683Z Got exit code 1 2025-12-04T10:49:11.1828923Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1829065Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1829263Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9c4a958d876758d6.xml 2025-12-04T10:49:11.1829324Z ============================= test session starts ============================== 2025-12-04T10:49:11.1829437Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1829481Z cachedir: .pytest_cache 2025-12-04T10:49:11.1829651Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1829700Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1829741Z configfile: pytest.ini 2025-12-04T10:49:11.1829910Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1829986Z collecting ... collected 58 items / 47 deselected / 11 selected 2025-12-04T10:49:11.1830041Z stepcurrent: skipping 47 already run items. 2025-12-04T10:49:11.1830085Z Running 11 items in this shard 2025-12-04T10:49:11.1830089Z 2025-12-04T10:49:11.1830341Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6483s] [ 9%] 2025-12-04T10:49:11.1830589Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6317s] [ 9%] 2025-12-04T10:49:11.1830810Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.6599s] [ 9%] 2025-12-04T10:49:11.1830813Z 2025-12-04T10:49:11.1830867Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1831035Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1831095Z Traceback (most recent call last): 2025-12-04T10:49:11.1831252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1831295Z method(*args, **kwargs) 2025-12-04T10:49:11.1831446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1831489Z method(*args, **kwargs) 2025-12-04T10:49:11.1831640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1831680Z with policy(): 2025-12-04T10:49:11.1831834Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1831925Z raise RuntimeError(msg) 2025-12-04T10:49:11.1832321Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1832324Z 2025-12-04T10:49:11.1832397Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1832691Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1832694Z 2025-12-04T10:49:11.1832780Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1832868Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1832925Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1833103Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1833177Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1833215Z graph_break [] 2025-12-04T10:49:11.1833364Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1833412Z Traceback (most recent call last): 2025-12-04T10:49:11.1833592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1833635Z method(*args, **kwargs) 
2025-12-04T10:49:11.1833787Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1833830Z method(*args, **kwargs) 2025-12-04T10:49:11.1833982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1834024Z with policy(): 2025-12-04T10:49:11.1834178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1834219Z raise RuntimeError(msg) 2025-12-04T10:49:11.1834621Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1834623Z 2025-12-04T10:49:11.1834696Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1835008Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1835026Z 2025-12-04T10:49:11.1835111Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1835187Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1835242Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1835420Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1835494Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.1835531Z graph_break [] 2025-12-04T10:49:11.1835605Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1835660Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1835733Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1835906Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1835947Z graph_break [] 2025-12-04T10:49:11.1836000Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1836151Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1836197Z Traceback (most recent call last): 2025-12-04T10:49:11.1836352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1836392Z method(*args, **kwargs) 2025-12-04T10:49:11.1836553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1836594Z method(*args, **kwargs) 2025-12-04T10:49:11.1836746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1836785Z with policy(): 2025-12-04T10:49:11.1836939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1836979Z raise RuntimeError(msg) 2025-12-04T10:49:11.1837395Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1837397Z 2025-12-04T10:49:11.1837472Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1837758Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1837761Z 2025-12-04T10:49:11.1837849Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1837919Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1837976Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1838149Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1838223Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1838259Z graph_break [] 2025-12-04T10:49:11.1838334Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1838398Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1838470Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1838642Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1838699Z graph_break [] 2025-12-04T10:49:11.1838770Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1838825Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1838896Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1839070Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1839109Z graph_break [] 2025-12-04T10:49:11.1839354Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-9c4a958d876758d6.xml - 2025-12-04T10:49:11.1839417Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1840045Z FAILED [0.6599s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1840048Z 2025-12-04T10:49:11.1840122Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1840420Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1840426Z 2025-12-04T10:49:11.1840510Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1840575Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1840641Z ================== 1 failed, 47 deselected, 2 rerun in 4.10s =================== 2025-12-04T10:49:11.1840680Z Got exit code 1 2025-12-04T10:49:11.1840720Z Retrying single test... 
2025-12-04T10:49:11.1840935Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-355eeece7eea2d96.xml 2025-12-04T10:49:11.1840994Z ============================= test session starts ============================== 2025-12-04T10:49:11.1841109Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1841151Z cachedir: .pytest_cache 2025-12-04T10:49:11.1841315Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1841362Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1841405Z configfile: pytest.ini 2025-12-04T10:49:11.1841568Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1841643Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1841976Z stepcurrent: skipping 47 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1842024Z Running 1 items in this shard 2025-12-04T10:49:11.1842026Z 2025-12-04T10:49:11.1842388Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:40:40.746962073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1842423Z 2025-12-04T10:49:11.1842576Z [W1204 10:40:47.357092448 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1842577Z 2025-12-04T10:49:11.1842730Z [W1204 10:40:47.357251027 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1842732Z 2025-12-04T10:49:11.1842881Z [W1204 10:40:47.360885029 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1842882Z 2025-12-04T10:49:11.1843036Z [W1204 10:40:47.361175476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1843039Z 2025-12-04T10:49:11.1843187Z [W1204 10:40:47.361255555 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1843193Z 2025-12-04T10:49:11.1843341Z [W1204 10:40:47.363599171 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1843342Z 2025-12-04T10:49:11.1843493Z [W1204 10:40:47.363860788 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1843495Z 2025-12-04T10:49:11.1843644Z [W1204 10:40:47.363936997 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1843646Z 2025-12-04T10:49:11.1843700Z ('RERUN', {'yellow': True}) [10.1594s] [100%] 2025-12-04T10:49:11.1844067Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:40:48.305947976 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1844071Z 2025-12-04T10:49:11.1844223Z [W1204 10:40:48.306333952 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1844224Z 2025-12-04T10:49:11.1844374Z [W1204 10:40:48.306431281 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1844377Z 2025-12-04T10:49:11.1844544Z [W1204 10:40:48.307792387 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1844546Z 2025-12-04T10:49:11.1844696Z [W1204 10:40:48.308135073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1844699Z 2025-12-04T10:49:11.1844847Z [W1204 10:40:48.308219112 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1844848Z 2025-12-04T10:49:11.1844999Z [W1204 10:40:48.310386640 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1845002Z 2025-12-04T10:49:11.1845156Z [W1204 10:40:48.310649527 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1845158Z 2025-12-04T10:49:11.1845307Z [W1204 10:40:48.310725706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1845309Z 2025-12-04T10:49:11.1845364Z ('RERUN', {'yellow': True}) [0.4445s] [100%] 2025-12-04T10:49:11.1845721Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:40:49.744647903 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1845743Z 2025-12-04T10:49:11.1845897Z [W1204 10:40:49.745019939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1845899Z 2025-12-04T10:49:11.1846052Z [W1204 10:40:49.745101528 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846054Z 2025-12-04T10:49:11.1846203Z [W1204 10:40:49.746454044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846204Z 2025-12-04T10:49:11.1846356Z [W1204 10:40:49.746775971 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846359Z 2025-12-04T10:49:11.1846510Z [W1204 10:40:49.746853190 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846512Z 2025-12-04T10:49:11.1846665Z [W1204 10:40:49.749032407 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846667Z 2025-12-04T10:49:11.1846814Z [W1204 10:40:49.749293734 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1846820Z 2025-12-04T10:49:11.1846968Z [W1204 10:40:49.749369784 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1846970Z 2025-12-04T10:49:11.1847014Z FAILED [0.4393s] [100%] 2025-12-04T10:49:11.1847016Z 2025-12-04T10:49:11.1847069Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1847237Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1847284Z Traceback (most recent call last): 2025-12-04T10:49:11.1847445Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1847488Z method(*args, **kwargs) 2025-12-04T10:49:11.1847645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1847687Z method(*args, **kwargs) 2025-12-04T10:49:11.1847855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1847896Z with policy(): 2025-12-04T10:49:11.1848054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1848097Z raise RuntimeError(msg) 2025-12-04T10:49:11.1848495Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1848498Z 2025-12-04T10:49:11.1848576Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1848865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1848867Z 2025-12-04T10:49:11.1848957Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1849031Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1849104Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1849280Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1849370Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1849408Z graph_break [] 2025-12-04T10:49:11.1849485Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1849833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1849883Z if out == self.unknown_value: 2025-12-04T10:49:11.1850033Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1850085Z Traceback (most recent call last): 2025-12-04T10:49:11.1850243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1850284Z method(*args, **kwargs) 2025-12-04T10:49:11.1850441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1850483Z method(*args, **kwargs) 2025-12-04T10:49:11.1850637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1850674Z with policy(): 2025-12-04T10:49:11.1850832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1850874Z raise RuntimeError(msg) 2025-12-04T10:49:11.1851290Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1851295Z 2025-12-04T10:49:11.1851369Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1851660Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1851662Z 2025-12-04T10:49:11.1851762Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1851839Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1851937Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1852114Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1852191Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1852229Z graph_break [] 2025-12-04T10:49:11.1852308Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1852656Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1852705Z if out == self.unknown_value: 2025-12-04T10:49:11.1852778Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1852838Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1852910Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1853091Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1853150Z graph_break [] 2025-12-04T10:49:11.1853224Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1853374Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1853424Z Traceback (most recent call last): 2025-12-04T10:49:11.1853578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1853624Z method(*args, **kwargs) 2025-12-04T10:49:11.1853776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1853821Z method(*args, **kwargs) 2025-12-04T10:49:11.1853973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1854018Z with policy(): 2025-12-04T10:49:11.1854175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1854218Z raise RuntimeError(msg) 2025-12-04T10:49:11.1854621Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1854624Z 2025-12-04T10:49:11.1854698Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1855003Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1855006Z 2025-12-04T10:49:11.1855093Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1855170Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1855227Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1855405Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1855481Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1855533Z graph_break [] 2025-12-04T10:49:11.1855612Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1855956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1856007Z if out == self.unknown_value: 2025-12-04T10:49:11.1856081Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1856141Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1856212Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1856391Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1856429Z graph_break [] 2025-12-04T10:49:11.1856505Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1856560Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1856635Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1856809Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1856861Z graph_break [] 2025-12-04T10:49:11.1857123Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-355eeece7eea2d96.xml - 2025-12-04T10:49:11.1857187Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1857822Z FAILED [0.4393s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1857826Z 2025-12-04T10:49:11.1857899Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1858191Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1858195Z 2025-12-04T10:49:11.1858280Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1858345Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1858413Z ================== 1 failed, 57 deselected, 2 rerun in 11.21s ================== 2025-12-04T10:49:11.1858454Z Got exit code 1 2025-12-04T10:49:11.1858496Z Retrying single test... 2025-12-04T10:49:11.1858713Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4b6d77887180fe9b.xml 2025-12-04T10:49:11.1858774Z ============================= test session starts ============================== 2025-12-04T10:49:11.1858890Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1858934Z cachedir: .pytest_cache 2025-12-04T10:49:11.1859097Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1859148Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1859189Z configfile: pytest.ini 2025-12-04T10:49:11.1859368Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1859443Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1859733Z stepcurrent: skipping 47 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1859780Z Running 1 items in this shard 2025-12-04T10:49:11.1859782Z 2025-12-04T10:49:11.1860146Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:40:57.425050108 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860150Z 2025-12-04T10:49:11.1860303Z [W1204 10:41:05.795707964 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860306Z 2025-12-04T10:49:11.1860461Z [W1204 10:41:05.795865272 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860465Z 2025-12-04T10:49:11.1860618Z [W1204 10:41:05.799354685 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860632Z 2025-12-04T10:49:11.1860781Z [W1204 10:41:05.799646982 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860803Z 2025-12-04T10:49:11.1860955Z [W1204 10:41:05.799723811 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1860957Z 2025-12-04T10:49:11.1861104Z [W1204 10:41:05.802064137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1861107Z 2025-12-04T10:49:11.1861260Z [W1204 10:41:05.802324234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1861261Z 2025-12-04T10:49:11.1861414Z [W1204 10:41:05.802399673 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1861417Z 2025-12-04T10:49:11.1861468Z ('RERUN', {'yellow': True}) [9.9317s] [100%] 2025-12-04T10:49:11.1861832Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:41:06.758443882 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1861835Z 2025-12-04T10:49:11.1862024Z [W1204 10:41:06.758814018 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862026Z 2025-12-04T10:49:11.1862179Z [W1204 10:41:06.758896837 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862181Z 2025-12-04T10:49:11.1862347Z [W1204 10:41:06.760271752 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862350Z 2025-12-04T10:49:11.1862500Z [W1204 10:41:06.760599309 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862503Z 2025-12-04T10:49:11.1862658Z [W1204 10:41:06.760676868 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862660Z 2025-12-04T10:49:11.1862810Z [W1204 10:41:06.762860185 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1862812Z 2025-12-04T10:49:11.1862980Z [W1204 10:41:06.763121242 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1862982Z 2025-12-04T10:49:11.1863132Z [W1204 10:41:06.763198941 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1863138Z 2025-12-04T10:49:11.1863187Z ('RERUN', {'yellow': True}) [0.4718s] [100%] 2025-12-04T10:49:11.1863546Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:41:06.217208541 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1863549Z 2025-12-04T10:49:11.1863697Z [W1204 10:41:06.217591607 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1863699Z 2025-12-04T10:49:11.1863854Z [W1204 10:41:06.217672096 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1863856Z 2025-12-04T10:49:11.1864005Z [W1204 10:41:06.219053491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1864022Z 2025-12-04T10:49:11.1864174Z [W1204 10:41:06.219382458 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1864189Z 2025-12-04T10:49:11.1864341Z [W1204 10:41:06.219460087 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1864343Z 2025-12-04T10:49:11.1864491Z [W1204 10:41:06.221628734 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1864493Z 2025-12-04T10:49:11.1864646Z [W1204 10:41:06.221884871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1864649Z 2025-12-04T10:49:11.1864797Z [W1204 10:41:06.221960730 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1864800Z 2025-12-04T10:49:11.1864844Z FAILED [0.4606s] [100%] 2025-12-04T10:49:11.1864846Z 2025-12-04T10:49:11.1864900Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1865055Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1865107Z Traceback (most recent call last): 2025-12-04T10:49:11.1865265Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1865311Z method(*args, **kwargs) 2025-12-04T10:49:11.1865465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1865509Z method(*args, **kwargs) 2025-12-04T10:49:11.1865662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1865715Z with policy(): 2025-12-04T10:49:11.1865872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1865918Z raise RuntimeError(msg) 2025-12-04T10:49:11.1866317Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1866320Z 2025-12-04T10:49:11.1866409Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1866698Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1866706Z 2025-12-04T10:49:11.1866794Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1866871Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1866929Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1867112Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1867186Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1872228Z graph_break [] 2025-12-04T10:49:11.1872316Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1872666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1872751Z if out == self.unknown_value: 2025-12-04T10:49:11.1872903Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1872970Z Traceback (most recent call last): 2025-12-04T10:49:11.1873126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1873170Z method(*args, **kwargs) 2025-12-04T10:49:11.1873321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1873366Z method(*args, **kwargs) 2025-12-04T10:49:11.1873516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1873557Z with policy(): 2025-12-04T10:49:11.1873710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1873755Z raise RuntimeError(msg) 2025-12-04T10:49:11.1874159Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1874163Z 2025-12-04T10:49:11.1874236Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1874526Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1874528Z 2025-12-04T10:49:11.1874615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1874703Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1874760Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1874939Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1875016Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1875056Z graph_break [] 2025-12-04T10:49:11.1875128Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1875492Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1875539Z if out == self.unknown_value: 2025-12-04T10:49:11.1875612Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1875673Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1875745Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1875924Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1875961Z graph_break [] 2025-12-04T10:49:11.1876016Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1876168Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1876216Z Traceback (most recent call last): 2025-12-04T10:49:11.1876371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1876413Z method(*args, **kwargs) 2025-12-04T10:49:11.1876577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1876619Z method(*args, **kwargs) 2025-12-04T10:49:11.1876769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1876832Z with policy(): 2025-12-04T10:49:11.1876984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1877028Z raise RuntimeError(msg) 2025-12-04T10:49:11.1877428Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1877434Z 2025-12-04T10:49:11.1877510Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1877800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1877803Z 2025-12-04T10:49:11.1877889Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1877964Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1878019Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1878198Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1878270Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1878309Z graph_break [] 2025-12-04T10:49:11.1878390Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1878734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1878778Z if out == self.unknown_value: 2025-12-04T10:49:11.1878852Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1878906Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1878980Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1879166Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1879208Z graph_break [] 2025-12-04T10:49:11.1879282Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1879337Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1879411Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1879584Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1879623Z graph_break [] 2025-12-04T10:49:11.1879867Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4b6d77887180fe9b.xml - 2025-12-04T10:49:11.1879931Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1880559Z FAILED [0.4606s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1880584Z 2025-12-04T10:49:11.1880659Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1880947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1880950Z 2025-12-04T10:49:11.1881035Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1881101Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1881169Z ================== 1 failed, 57 deselected, 2 rerun in 11.03s ================== 2025-12-04T10:49:11.1881210Z Got exit code 1 2025-12-04T10:49:11.1881449Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1881581Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1881780Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-517edc140fbd82bc.xml 2025-12-04T10:49:11.1881840Z ============================= test session starts ============================== 2025-12-04T10:49:11.1882010Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1882056Z cachedir: .pytest_cache 2025-12-04T10:49:11.1882230Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1882283Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1882323Z configfile: pytest.ini 2025-12-04T10:49:11.1882491Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1882566Z collecting ... collected 58 items / 48 deselected / 10 selected 2025-12-04T10:49:11.1882622Z stepcurrent: skipping 48 already run items. 2025-12-04T10:49:11.1882667Z Running 10 items in this shard 2025-12-04T10:49:11.1882672Z 2025-12-04T10:49:11.1882938Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.9535s] [ 10%] 2025-12-04T10:49:11.1883185Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4401s] [ 10%] 2025-12-04T10:49:11.1883409Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.4483s] [ 10%] 2025-12-04T10:49:11.1883412Z 2025-12-04T10:49:11.1883466Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1883616Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1883664Z Traceback (most recent call last): 2025-12-04T10:49:11.1883822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1883866Z method(*args, **kwargs) 2025-12-04T10:49:11.1884018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1884061Z method(*args, **kwargs) 2025-12-04T10:49:11.1884212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1884266Z with policy(): 2025-12-04T10:49:11.1884421Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1884478Z raise RuntimeError(msg) 2025-12-04T10:49:11.1884877Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1884880Z 2025-12-04T10:49:11.1884953Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1885245Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1885248Z 2025-12-04T10:49:11.1885334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1885409Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1885465Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1885742Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1885817Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1885854Z graph_break [] 2025-12-04T10:49:11.1886004Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1886060Z Traceback (most recent call last): 2025-12-04T10:49:11.1886217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.1886257Z method(*args, **kwargs) 2025-12-04T10:49:11.1886412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1886451Z method(*args, **kwargs) 2025-12-04T10:49:11.1886603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1886642Z with policy(): 2025-12-04T10:49:11.1886809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1886850Z raise RuntimeError(msg) 2025-12-04T10:49:11.1887250Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1887254Z 2025-12-04T10:49:11.1887327Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1887616Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1887618Z 2025-12-04T10:49:11.1887706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1887778Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1887836Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1888111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1888195Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1888248Z graph_break [] 2025-12-04T10:49:11.1888323Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1888377Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1888450Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1888718Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1888758Z graph_break [] 2025-12-04T10:49:11.1888811Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1888966Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.1889011Z Traceback (most recent call last): 2025-12-04T10:49:11.1889168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1889207Z method(*args, **kwargs) 2025-12-04T10:49:11.1889361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1889400Z method(*args, **kwargs) 2025-12-04T10:49:11.1889554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1889593Z with policy(): 2025-12-04T10:49:11.1889744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1889798Z raise RuntimeError(msg) 2025-12-04T10:49:11.1890198Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.1890201Z 2025-12-04T10:49:11.1890277Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1890578Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1890580Z 2025-12-04T10:49:11.1890668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1890740Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1890799Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1891070Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1891144Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1891183Z graph_break [] 2025-12-04T10:49:11.1891255Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1891311Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1891382Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1891653Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1891709Z graph_break [] 2025-12-04T10:49:11.1891783Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1891837Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1891974Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1892242Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1892281Z graph_break [] 2025-12-04T10:49:11.1892526Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-517edc140fbd82bc.xml - 2025-12-04T10:49:11.1892589Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1893219Z FAILED [0.4483s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1893222Z 2025-12-04T10:49:11.1893295Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1893589Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1893591Z 2025-12-04T10:49:11.1893693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1893760Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.1893826Z ================== 1 failed, 48 deselected, 2 rerun in 4.01s =================== 2025-12-04T10:49:11.1893867Z Got exit code 1 2025-12-04T10:49:11.1893908Z Retrying single test... 2025-12-04T10:49:11.1894109Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4249fd23af3c224b.xml 2025-12-04T10:49:11.1894167Z ============================= test session starts ============================== 2025-12-04T10:49:11.1894296Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1894338Z cachedir: .pytest_cache 2025-12-04T10:49:11.1894499Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1894548Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1894589Z configfile: pytest.ini 2025-12-04T10:49:11.1894755Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1894828Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1895115Z stepcurrent: skipping 48 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1895159Z Running 1 items in this shard 2025-12-04T10:49:11.1895161Z 2025-12-04T10:49:11.1895524Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:26.441645850 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1895539Z 2025-12-04T10:49:11.1895694Z [W1204 10:41:34.145540575 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1895696Z 2025-12-04T10:49:11.1895850Z [W1204 10:41:34.145681653 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1895869Z 2025-12-04T10:49:11.1896020Z [W1204 10:41:34.149311413 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1896022Z 2025-12-04T10:49:11.1896170Z [W1204 10:41:34.149598540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1896172Z 2025-12-04T10:49:11.1896323Z [W1204 10:41:34.149676399 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1896325Z 2025-12-04T10:49:11.1896473Z [W1204 10:41:34.152249511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1896475Z 2025-12-04T10:49:11.1896625Z [W1204 10:41:34.152521188 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1896627Z 2025-12-04T10:49:11.1896777Z [W1204 10:41:34.152598977 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1896779Z 2025-12-04T10:49:11.1896831Z ('RERUN', {'yellow': True}) [10.7095s] [100%] 2025-12-04T10:49:11.1897192Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:35.793879695 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897194Z 2025-12-04T10:49:11.1897358Z [W1204 10:41:35.794243001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897360Z 2025-12-04T10:49:11.1897511Z [W1204 10:41:35.794332420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897514Z 2025-12-04T10:49:11.1897664Z [W1204 10:41:35.795732505 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897666Z 2025-12-04T10:49:11.1897829Z [W1204 10:41:35.795981572 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897831Z 2025-12-04T10:49:11.1897983Z [W1204 10:41:35.796059491 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1897985Z 2025-12-04T10:49:11.1898134Z [W1204 10:41:35.798326706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1898136Z 2025-12-04T10:49:11.1898286Z [W1204 10:41:35.798579413 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1898289Z 2025-12-04T10:49:11.1898436Z [W1204 10:41:35.798654313 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1898441Z 2025-12-04T10:49:11.1898491Z ('RERUN', {'yellow': True}) [0.5053s] [100%] 2025-12-04T10:49:11.1898848Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:35.295134303 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1898850Z 2025-12-04T10:49:11.1898998Z [W1204 10:41:35.295497359 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899011Z 2025-12-04T10:49:11.1899161Z [W1204 10:41:35.295581838 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899176Z 2025-12-04T10:49:11.1899323Z [W1204 10:41:35.296976103 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899327Z 2025-12-04T10:49:11.1899478Z [W1204 10:41:35.297235220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899480Z 2025-12-04T10:49:11.1899630Z [W1204 10:41:35.297312879 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899632Z 2025-12-04T10:49:11.1899781Z [W1204 10:41:35.299567364 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1899783Z 2025-12-04T10:49:11.1899935Z [W1204 10:41:35.299817901 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1899939Z 2025-12-04T10:49:11.1900087Z [W1204 10:41:35.299891621 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1900089Z 2025-12-04T10:49:11.1900130Z FAILED [0.5013s] [100%] 2025-12-04T10:49:11.1900132Z 2025-12-04T10:49:11.1900185Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1900338Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1900386Z Traceback (most recent call last): 2025-12-04T10:49:11.1900555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1900600Z method(*args, **kwargs) 2025-12-04T10:49:11.1900753Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1900797Z method(*args, **kwargs) 2025-12-04T10:49:11.1900948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1900988Z with policy(): 2025-12-04T10:49:11.1901141Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1901195Z raise RuntimeError(msg) 2025-12-04T10:49:11.1901592Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1901595Z 2025-12-04T10:49:11.1901671Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1902003Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1902008Z 2025-12-04T10:49:11.1902094Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1902171Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1902227Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1902504Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1902593Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1902633Z graph_break [] 2025-12-04T10:49:11.1902705Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1903068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1903113Z if out == self.unknown_value: 2025-12-04T10:49:11.1903268Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1903313Z Traceback (most recent call last): 2025-12-04T10:49:11.1903469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1903511Z method(*args, **kwargs) 2025-12-04T10:49:11.1903666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1903705Z method(*args, **kwargs) 2025-12-04T10:49:11.1903860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1903897Z with policy(): 2025-12-04T10:49:11.1904052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1904095Z raise RuntimeError(msg) 2025-12-04T10:49:11.1904511Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1904515Z 2025-12-04T10:49:11.1904591Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1904878Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1904881Z 2025-12-04T10:49:11.1904970Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1905041Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1905120Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1905392Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1905470Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1905510Z graph_break [] 2025-12-04T10:49:11.1905581Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1905928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1905972Z if out == self.unknown_value: 2025-12-04T10:49:11.1906045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1906101Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1906175Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1906446Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1906497Z graph_break [] 2025-12-04T10:49:11.1906549Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1906717Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1906762Z Traceback (most recent call last): 2025-12-04T10:49:11.1906919Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1906959Z method(*args, **kwargs) 2025-12-04T10:49:11.1907115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1907155Z method(*args, **kwargs) 2025-12-04T10:49:11.1907309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1907347Z with policy(): 2025-12-04T10:49:11.1907501Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1907543Z raise RuntimeError(msg) 2025-12-04T10:49:11.1907947Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1907949Z 2025-12-04T10:49:11.1908025Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1908323Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1908327Z 2025-12-04T10:49:11.1908416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1908488Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1908547Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1908821Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1908910Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1908948Z graph_break [] 2025-12-04T10:49:11.1909022Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1909368Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1909412Z if out == self.unknown_value: 2025-12-04T10:49:11.1909486Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1909541Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1909615Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1909884Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1909923Z graph_break [] 2025-12-04T10:49:11.1909994Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1910050Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1910135Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1910406Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1910455Z graph_break [] 2025-12-04T10:49:11.1910701Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-4249fd23af3c224b.xml - 2025-12-04T10:49:11.1910760Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1911400Z FAILED [0.5013s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1911404Z 2025-12-04T10:49:11.1911479Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1911764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1911766Z 2025-12-04T10:49:11.1911907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1911970Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1912039Z ================== 1 failed, 57 deselected, 2 rerun in 11.86s ================== 2025-12-04T10:49:11.1912077Z Got exit code 1 2025-12-04T10:49:11.1912141Z Retrying single test... 2025-12-04T10:49:11.1912340Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-adbf78b3acd750cf.xml 2025-12-04T10:49:11.1912401Z ============================= test session starts ============================== 2025-12-04T10:49:11.1912516Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1912558Z cachedir: .pytest_cache 2025-12-04T10:49:11.1912719Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1912780Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1912825Z configfile: pytest.ini 2025-12-04T10:49:11.1912990Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1913067Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1913353Z stepcurrent: skipping 48 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1913402Z Running 1 items in this shard 2025-12-04T10:49:11.1913404Z 2025-12-04T10:49:11.1913767Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:45.023542651 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1913770Z 2025-12-04T10:49:11.1913927Z [W1204 10:41:52.522737613 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1913929Z 2025-12-04T10:49:11.1914085Z [W1204 10:41:52.522911771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1914100Z 2025-12-04T10:49:11.1914250Z [W1204 10:41:53.526707989 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1914269Z 2025-12-04T10:49:11.1914421Z [W1204 10:41:53.527012196 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1914423Z 2025-12-04T10:49:11.1914571Z [W1204 10:41:53.527090465 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1914572Z 2025-12-04T10:49:11.1914724Z [W1204 10:41:53.529668306 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1914725Z 2025-12-04T10:49:11.1914876Z [W1204 10:41:53.529931233 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1914878Z 2025-12-04T10:49:11.1915026Z [W1204 10:41:53.530011682 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1915029Z 2025-12-04T10:49:11.1915084Z ('RERUN', {'yellow': True}) [10.4843s] [100%] 2025-12-04T10:49:11.1915440Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:53.189439214 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1915442Z 2025-12-04T10:49:11.1915594Z [W1204 10:41:53.189811970 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1915596Z 2025-12-04T10:49:11.1915757Z [W1204 10:41:53.189907968 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1915760Z 2025-12-04T10:49:11.1915909Z [W1204 10:41:53.191299843 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1915912Z 2025-12-04T10:49:11.1916058Z [W1204 10:41:53.191553180 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1916060Z 2025-12-04T10:49:11.1916209Z [W1204 10:41:53.191632309 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1916211Z 2025-12-04T10:49:11.1916369Z [W1204 10:41:53.193903764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1916371Z 2025-12-04T10:49:11.1916520Z [W1204 10:41:53.194172771 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1916524Z 2025-12-04T10:49:11.1916671Z [W1204 10:41:53.194250740 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1916674Z 2025-12-04T10:49:11.1916723Z ('RERUN', {'yellow': True}) [0.5209s] [100%] 2025-12-04T10:49:11.1917077Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:41:54.702437167 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1917079Z 2025-12-04T10:49:11.1917227Z [W1204 10:41:54.702810823 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1917229Z 2025-12-04T10:49:11.1917378Z [W1204 10:41:54.702903252 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1917390Z 2025-12-04T10:49:11.1917537Z [W1204 10:41:54.704288007 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1917554Z 2025-12-04T10:49:11.1917702Z [W1204 10:41:54.704545334 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1917704Z 2025-12-04T10:49:11.1917853Z [W1204 10:41:54.704621863 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1917855Z 2025-12-04T10:49:11.1918003Z [W1204 10:41:54.706910377 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1918005Z 2025-12-04T10:49:11.1918153Z [W1204 10:41:54.707170294 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1918156Z 2025-12-04T10:49:11.1918302Z [W1204 10:41:54.707249194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1918306Z 2025-12-04T10:49:11.1918345Z FAILED [0.5087s] [100%] 2025-12-04T10:49:11.1918346Z 2025-12-04T10:49:11.1918399Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1918548Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1918594Z Traceback (most recent call last): 2025-12-04T10:49:11.1918751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1918792Z method(*args, **kwargs) 2025-12-04T10:49:11.1918944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1919001Z method(*args, **kwargs) 2025-12-04T10:49:11.1919152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1919192Z with policy(): 2025-12-04T10:49:11.1919344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.1919387Z raise RuntimeError(msg) 2025-12-04T10:49:11.1919792Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.1919794Z 2025-12-04T10:49:11.1919869Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1920159Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1920164Z 2025-12-04T10:49:11.1920248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1920321Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1920375Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1920648Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1920719Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1920756Z graph_break [] 2025-12-04T10:49:11.1920826Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1921182Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1921244Z if out == self.unknown_value: 2025-12-04T10:49:11.1921394Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1921438Z Traceback (most recent call last): 2025-12-04T10:49:11.1921592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1921632Z method(*args, **kwargs) 2025-12-04T10:49:11.1921783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1921821Z method(*args, **kwargs) 2025-12-04T10:49:11.1922021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1922058Z with policy(): 2025-12-04T10:49:11.1922209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1922251Z raise RuntimeError(msg) 2025-12-04T10:49:11.1922649Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.1922652Z 2025-12-04T10:49:11.1922725Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1923030Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1923033Z 2025-12-04T10:49:11.1923122Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1923194Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1923252Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1923520Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1923605Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1923644Z graph_break [] 2025-12-04T10:49:11.1923714Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1924057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1924100Z if out == self.unknown_value: 2025-12-04T10:49:11.1924173Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1924227Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1924300Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1924573Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1924610Z graph_break [] 2025-12-04T10:49:11.1924660Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1924810Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.1924868Z Traceback (most recent call last): 2025-12-04T10:49:11.1925021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1925084Z method(*args, **kwargs) 2025-12-04T10:49:11.1925235Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1925273Z method(*args, **kwargs) 2025-12-04T10:49:11.1925424Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1925460Z with policy(): 2025-12-04T10:49:11.1925613Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1925654Z raise RuntimeError(msg) 2025-12-04T10:49:11.1926054Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1926057Z 2025-12-04T10:49:11.1926129Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1926414Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1926417Z 2025-12-04T10:49:11.1926504Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1926575Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1926643Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1926914Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1926988Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1927024Z graph_break [] 2025-12-04T10:49:11.1927093Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1927445Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1927487Z if out == self.unknown_value: 2025-12-04T10:49:11.1927559Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1927613Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1927685Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1927951Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1927989Z graph_break [] 2025-12-04T10:49:11.1928059Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1928113Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1928183Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1928450Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1928496Z graph_break [] 2025-12-04T10:49:11.1928740Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-adbf78b3acd750cf.xml - 2025-12-04T10:49:11.1928813Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1929438Z FAILED [0.5087s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.1929440Z 2025-12-04T10:49:11.1929513Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1929798Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1929802Z 2025-12-04T10:49:11.1929887Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1929947Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1930013Z ================== 1 failed, 57 deselected, 2 rerun in 11.66s ================== 2025-12-04T10:49:11.1930049Z Got exit code 1 2025-12-04T10:49:11.1930287Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.1930425Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1930623Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ee478bc0f04752c9.xml 2025-12-04T10:49:11.1930680Z ============================= test session starts ============================== 2025-12-04T10:49:11.1930793Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1930836Z cachedir: .pytest_cache 2025-12-04T10:49:11.1930992Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1931039Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.1931089Z configfile: pytest.ini 2025-12-04T10:49:11.1931253Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1931325Z collecting ... collected 58 items / 49 deselected / 9 selected 2025-12-04T10:49:11.1931380Z stepcurrent: skipping 49 already run items. 2025-12-04T10:49:11.1931423Z Running 9 items in this shard 2025-12-04T10:49:11.1931424Z 2025-12-04T10:49:11.1931676Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6380s] [ 11%] 2025-12-04T10:49:11.1931950Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.5800s] [ 11%] 2025-12-04T10:49:11.1932175Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 FAILED [0.5775s] [ 11%] 2025-12-04T10:49:11.1932178Z 2025-12-04T10:49:11.1932231Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1932399Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1932446Z Traceback (most recent call last): 2025-12-04T10:49:11.1932601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1932667Z method(*args, **kwargs) 2025-12-04T10:49:11.1932817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1932857Z method(*args, **kwargs) 2025-12-04T10:49:11.1933006Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1933044Z with policy(): 2025-12-04T10:49:11.1933196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1933237Z raise RuntimeError(msg) 2025-12-04T10:49:11.1933641Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1933645Z 2025-12-04T10:49:11.1933717Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1934009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1934011Z 2025-12-04T10:49:11.1934096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1934169Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1934240Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1934418Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1934490Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1934527Z graph_break [] 2025-12-04T10:49:11.1934677Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1934722Z Traceback (most recent call 
last): 2025-12-04T10:49:11.1934890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1934931Z method(*args, **kwargs) 2025-12-04T10:49:11.1935080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1935120Z method(*args, **kwargs) 2025-12-04T10:49:11.1935269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1935307Z with policy(): 2025-12-04T10:49:11.1935459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1935500Z raise RuntimeError(msg) 2025-12-04T10:49:11.1935914Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1935916Z 2025-12-04T10:49:11.1935988Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1936282Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1936295Z 2025-12-04T10:49:11.1936380Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1936465Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1936519Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1936695Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1936766Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1936805Z graph_break [] 2025-12-04T10:49:11.1936876Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1936933Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1937004Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1937181Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1937219Z graph_break [] 2025-12-04T10:49:11.1937271Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1937423Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1937469Z Traceback (most recent call last): 2025-12-04T10:49:11.1937627Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1937666Z method(*args, **kwargs) 2025-12-04T10:49:11.1937832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1937873Z method(*args, **kwargs) 2025-12-04T10:49:11.1938024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1938062Z with policy(): 2025-12-04T10:49:11.1938216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1938257Z raise RuntimeError(msg) 2025-12-04T10:49:11.1938682Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1938684Z 2025-12-04T10:49:11.1938757Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1939052Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1939056Z 2025-12-04T10:49:11.1939144Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1939216Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1939273Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1939450Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1939524Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1939560Z graph_break [] 2025-12-04T10:49:11.1939634Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1939689Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1939775Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1939950Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1940002Z graph_break [] 2025-12-04T10:49:11.1940072Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1940128Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1940198Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1940373Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1940410Z graph_break [] 2025-12-04T10:49:11.1940657Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ee478bc0f04752c9.xml - 2025-12-04T10:49:11.1940720Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1941366Z FAILED [0.5775s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1941368Z 2025-12-04T10:49:11.1941441Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1941747Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1941750Z 2025-12-04T10:49:11.1941838Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1941949Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1942018Z ================== 1 failed, 49 deselected, 2 rerun in 3.97s =================== 2025-12-04T10:49:11.1942058Z Got exit code 1 2025-12-04T10:49:11.1942100Z Retrying single test... 
2025-12-04T10:49:11.1942317Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fb069a8caf7fba44.xml 2025-12-04T10:49:11.1942374Z ============================= test session starts ============================== 2025-12-04T10:49:11.1942486Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1942528Z cachedir: .pytest_cache 2025-12-04T10:49:11.1942688Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1942734Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1942779Z configfile: pytest.ini 2025-12-04T10:49:11.1942943Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1943016Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1943305Z stepcurrent: skipping 49 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1943351Z Running 1 items in this shard 2025-12-04T10:49:11.1943353Z 2025-12-04T10:49:11.1943719Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:13.119218368 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1943757Z 2025-12-04T10:49:11.1943911Z [W1204 10:42:20.492215069 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1943913Z 2025-12-04T10:49:11.1944066Z [W1204 10:42:20.492398047 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944068Z 2025-12-04T10:49:11.1944218Z [W1204 10:42:20.496202853 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944220Z 2025-12-04T10:49:11.1944371Z [W1204 10:42:20.496502770 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944375Z 2025-12-04T10:49:11.1944524Z [W1204 10:42:20.496584549 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944526Z 2025-12-04T10:49:11.1944678Z [W1204 10:42:20.499059231 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944680Z 2025-12-04T10:49:11.1944830Z [W1204 10:42:20.499322578 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944831Z 2025-12-04T10:49:11.1944980Z [W1204 10:42:20.499398367 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1944982Z 2025-12-04T10:49:11.1945036Z ('RERUN', {'yellow': True}) [10.1234s] [100%] 2025-12-04T10:49:11.1945410Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:22.598298634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1945414Z 2025-12-04T10:49:11.1945568Z [W1204 10:42:22.598680220 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1945570Z 2025-12-04T10:49:11.1945720Z [W1204 10:42:22.598771469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1945722Z 2025-12-04T10:49:11.1945883Z [W1204 10:42:22.600166553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1945886Z 2025-12-04T10:49:11.1946037Z [W1204 10:42:22.600505789 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1946040Z 2025-12-04T10:49:11.1946187Z [W1204 10:42:22.600585248 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1946189Z 2025-12-04T10:49:11.1946339Z [W1204 10:42:22.602878872 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1946342Z 2025-12-04T10:49:11.1946492Z [W1204 10:42:22.603152769 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1946494Z 2025-12-04T10:49:11.1946643Z [W1204 10:42:22.603232708 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1946645Z 2025-12-04T10:49:11.1946696Z ('RERUN', {'yellow': True}) [0.6151s] [100%] 2025-12-04T10:49:11.1947057Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:22.193453868 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947077Z 2025-12-04T10:49:11.1947238Z [W1204 10:42:22.193821504 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947240Z 2025-12-04T10:49:11.1947390Z [W1204 10:42:22.193903363 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947394Z 2025-12-04T10:49:11.1947543Z [W1204 10:42:22.195287517 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947545Z 2025-12-04T10:49:11.1947696Z [W1204 10:42:22.195617164 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947697Z 2025-12-04T10:49:11.1947847Z [W1204 10:42:22.195695523 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1947849Z 2025-12-04T10:49:11.1948000Z [W1204 10:42:22.197979347 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1948003Z 2025-12-04T10:49:11.1948152Z [W1204 10:42:22.198244723 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1948154Z 2025-12-04T10:49:11.1948306Z [W1204 10:42:22.198324003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1948308Z 2025-12-04T10:49:11.1948349Z FAILED [0.6071s] [100%] 2025-12-04T10:49:11.1948351Z 2025-12-04T10:49:11.1948404Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1948569Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1948617Z Traceback (most recent call last): 2025-12-04T10:49:11.1948778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1948818Z method(*args, **kwargs) 2025-12-04T10:49:11.1948975Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1949015Z method(*args, **kwargs) 2025-12-04T10:49:11.1949182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1949222Z with policy(): 2025-12-04T10:49:11.1949378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1949419Z raise RuntimeError(msg) 2025-12-04T10:49:11.1949824Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.1949827Z 2025-12-04T10:49:11.1949902Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1950197Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1950199Z 2025-12-04T10:49:11.1950288Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1950362Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1950419Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1950611Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1950700Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1950738Z graph_break [] 2025-12-04T10:49:11.1950813Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1951163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1951210Z if out == self.unknown_value: 2025-12-04T10:49:11.1951361Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1951410Z Traceback (most recent call last): 2025-12-04T10:49:11.1951565Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1951608Z method(*args, **kwargs) 2025-12-04T10:49:11.1951765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1951806Z method(*args, **kwargs) 2025-12-04T10:49:11.1952006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1952044Z with policy(): 2025-12-04T10:49:11.1952202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1952242Z raise RuntimeError(msg) 2025-12-04T10:49:11.1952676Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1952681Z 2025-12-04T10:49:11.1952754Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1953047Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1953049Z 2025-12-04T10:49:11.1953150Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1953226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1953282Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1953462Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1953537Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1953576Z graph_break [] 2025-12-04T10:49:11.1953652Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1954004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1954050Z if out == self.unknown_value: 2025-12-04T10:49:11.1954123Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1954181Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1954255Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1954436Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1954487Z graph_break [] 2025-12-04T10:49:11.1954545Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1954718Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1954769Z Traceback (most recent call last): 2025-12-04T10:49:11.1954926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1954971Z method(*args, **kwargs) 2025-12-04T10:49:11.1955124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1955166Z method(*args, **kwargs) 2025-12-04T10:49:11.1955321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1955361Z with policy(): 2025-12-04T10:49:11.1955516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1955560Z raise RuntimeError(msg) 2025-12-04T10:49:11.1955983Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1955985Z 2025-12-04T10:49:11.1956059Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1956369Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1956372Z 2025-12-04T10:49:11.1956459Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1956533Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1956590Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1956770Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1956843Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1956895Z graph_break [] 2025-12-04T10:49:11.1956970Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1957323Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1957370Z if out == self.unknown_value: 2025-12-04T10:49:11.1957442Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1957501Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1957573Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1957757Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1957794Z graph_break [] 2025-12-04T10:49:11.1957873Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1957929Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1958005Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1958195Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1958237Z graph_break [] 2025-12-04T10:49:11.1958503Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fb069a8caf7fba44.xml - 2025-12-04T10:49:11.1958568Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1959234Z FAILED [0.6071s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1959241Z 2025-12-04T10:49:11.1959315Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1959617Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1959620Z 2025-12-04T10:49:11.1959709Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1959777Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1959846Z ================== 1 failed, 57 deselected, 2 rerun in 11.49s ================== 2025-12-04T10:49:11.1959889Z Got exit code 1 2025-12-04T10:49:11.1959931Z Retrying single test... 2025-12-04T10:49:11.1960152Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dd694da01bad65e1.xml 2025-12-04T10:49:11.1960212Z ============================= test session starts ============================== 2025-12-04T10:49:11.1960332Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1960376Z cachedir: .pytest_cache 2025-12-04T10:49:11.1960543Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1960595Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1960638Z configfile: pytest.ini 2025-12-04T10:49:11.1960829Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1960905Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1961206Z stepcurrent: skipping 49 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1961253Z Running 1 items in this shard 2025-12-04T10:49:11.1961257Z 2025-12-04T10:49:11.1961636Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:31.264166314 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1961639Z 2025-12-04T10:49:11.1961798Z [W1204 10:42:39.616306916 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1961800Z 2025-12-04T10:49:11.1962009Z [W1204 10:42:39.616450034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962011Z 2025-12-04T10:49:11.1962168Z [W1204 10:42:39.620184531 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962187Z 2025-12-04T10:49:11.1962341Z [W1204 10:42:39.620479668 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962359Z 2025-12-04T10:49:11.1962517Z [W1204 10:42:39.620562287 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962519Z 2025-12-04T10:49:11.1962671Z [W1204 10:42:39.623050618 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962674Z 2025-12-04T10:49:11.1962830Z [W1204 10:42:39.623313695 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1962833Z 2025-12-04T10:49:11.1962989Z [W1204 10:42:39.623390904 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1962992Z 2025-12-04T10:49:11.1963045Z ('RERUN', {'yellow': True}) [10.0937s] [100%] 2025-12-04T10:49:11.1963418Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:40.700988265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1963421Z 2025-12-04T10:49:11.1963580Z [W1204 10:42:40.701360291 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1963582Z 2025-12-04T10:49:11.1963746Z [W1204 10:42:40.701440800 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1963748Z 2025-12-04T10:49:11.1963923Z [W1204 10:42:40.702822514 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1963930Z 2025-12-04T10:49:11.1964092Z [W1204 10:42:40.703151210 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1964095Z 2025-12-04T10:49:11.1964259Z [W1204 10:42:40.703232129 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1964261Z 2025-12-04T10:49:11.1964423Z [W1204 10:42:40.705513553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1964439Z 2025-12-04T10:49:11.1964611Z [W1204 10:42:40.705776889 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1964613Z 2025-12-04T10:49:11.1964781Z [W1204 10:42:40.705853579 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1964787Z 2025-12-04T10:49:11.1964843Z ('RERUN', {'yellow': True}) [0.5909s] [100%] 2025-12-04T10:49:11.1965236Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 [W1204 10:42:40.310659141 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1965239Z 2025-12-04T10:49:11.1965400Z [W1204 10:42:40.311053136 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1965403Z 2025-12-04T10:49:11.1965567Z [W1204 10:42:40.311148915 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1965569Z 2025-12-04T10:49:11.1965729Z [W1204 10:42:40.312559389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1965742Z 2025-12-04T10:49:11.1965906Z [W1204 10:42:40.312896505 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1965925Z 2025-12-04T10:49:11.1966090Z [W1204 10:42:40.312977034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1966092Z 2025-12-04T10:49:11.1966253Z [W1204 10:42:40.315271137 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1966254Z 2025-12-04T10:49:11.1966421Z [W1204 10:42:40.315528704 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1966423Z 2025-12-04T10:49:11.1966585Z [W1204 10:42:40.315604663 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1966587Z 2025-12-04T10:49:11.1966634Z FAILED [0.6035s] [100%] 2025-12-04T10:49:11.1966636Z 2025-12-04T10:49:11.1966693Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1966865Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1966919Z Traceback (most recent call last): 2025-12-04T10:49:11.1967091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1967140Z method(*args, **kwargs) 2025-12-04T10:49:11.1967308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1967355Z method(*args, **kwargs) 2025-12-04T10:49:11.1967531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1967577Z with policy(): 2025-12-04T10:49:11.1967745Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1967794Z raise RuntimeError(msg) 2025-12-04T10:49:11.1968234Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 1048576 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1968237Z 2025-12-04T10:49:11.1968334Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1968653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1968660Z 2025-12-04T10:49:11.1968755Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1968837Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1968899Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1969094Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1969173Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1969217Z graph_break [] 2025-12-04T10:49:11.1969298Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1969681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1969741Z if out == self.unknown_value: 2025-12-04T10:49:11.1969910Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1969974Z Traceback (most recent call last): 2025-12-04T10:49:11.1970144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1970189Z method(*args, **kwargs) 2025-12-04T10:49:11.1970358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1970402Z method(*args, **kwargs) 2025-12-04T10:49:11.1970571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1970612Z with policy(): 2025-12-04T10:49:11.1970782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1970828Z raise RuntimeError(msg) 2025-12-04T10:49:11.1971275Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 1048576 and is now reported as 2097152 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.1971279Z 2025-12-04T10:49:11.1971362Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1971679Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1971681Z 2025-12-04T10:49:11.1971789Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1971912Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1971976Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1972171Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1972253Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1972295Z graph_break [] 2025-12-04T10:49:11.1972376Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1972764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1972817Z if out == self.unknown_value: 2025-12-04T10:49:11.1972900Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1972960Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1973041Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1973231Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1973275Z graph_break [] 2025-12-04T10:49:11.1973332Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1973502Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.1973553Z Traceback (most recent call last): 2025-12-04T10:49:11.1973729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1973790Z method(*args, **kwargs) 2025-12-04T10:49:11.1973962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1974007Z method(*args, **kwargs) 2025-12-04T10:49:11.1974199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1974241Z with policy(): 2025-12-04T10:49:11.1974415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1974461Z raise RuntimeError(msg) 2025-12-04T10:49:11.1974929Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1974933Z 2025-12-04T10:49:11.1975017Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1975337Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1975341Z 2025-12-04T10:49:11.1975439Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1975519Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1975585Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1975781Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1975865Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1975935Z graph_break [] 2025-12-04T10:49:11.1976020Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.1976402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.1976455Z if out == self.unknown_value: 2025-12-04T10:49:11.1976535Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1976600Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1976692Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1976893Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1976938Z graph_break [] 2025-12-04T10:49:11.1977021Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1977086Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1977165Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1977361Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1977402Z graph_break [] 2025-12-04T10:49:11.1977679Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dd694da01bad65e1.xml - 2025-12-04T10:49:11.1977746Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1978467Z FAILED [0.6035s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 2097152 and is now reported as 3145728 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.1978494Z 2025-12-04T10:49:11.1978581Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1978902Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1978905Z 2025-12-04T10:49:11.1979003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1979073Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1979152Z ================== 1 failed, 57 deselected, 2 rerun in 11.44s ================== 2025-12-04T10:49:11.1979194Z Got exit code 1 2025-12-04T10:49:11.1979468Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.1979611Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.1979833Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f4be804bb213f75e.xml 2025-12-04T10:49:11.1979899Z ============================= test session starts ============================== 2025-12-04T10:49:11.1980027Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1980075Z cachedir: .pytest_cache 2025-12-04T10:49:11.1980266Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1980320Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1980370Z configfile: pytest.ini 2025-12-04T10:49:11.1980553Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1980638Z collecting ... collected 58 items / 50 deselected / 8 selected 2025-12-04T10:49:11.1980700Z stepcurrent: skipping 50 already run items. 2025-12-04T10:49:11.1980750Z Running 8 items in this shard 2025-12-04T10:49:11.1980752Z 2025-12-04T10:49:11.1981047Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6112s] [ 12%] 2025-12-04T10:49:11.1981321Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6693s] [ 12%] 2025-12-04T10:49:11.1981575Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 FAILED [0.7154s] [ 12%] 2025-12-04T10:49:11.1981578Z 2025-12-04T10:49:11.1981637Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1981810Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1981922Z Traceback (most recent call last): 2025-12-04T10:49:11.1982103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1982149Z method(*args, **kwargs) 2025-12-04T10:49:11.1982323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1982395Z method(*args, **kwargs) 2025-12-04T10:49:11.1982564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1982627Z with policy(): 2025-12-04T10:49:11.1982797Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1982847Z raise RuntimeError(msg) 2025-12-04T10:49:11.1983291Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.1983294Z 2025-12-04T10:49:11.1983377Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1983699Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1983702Z 2025-12-04T10:49:11.1983801Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1983880Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1983943Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1984137Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1984219Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1984261Z graph_break [] 2025-12-04T10:49:11.1984428Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1984508Z Traceback (most recent call last): 2025-12-04T10:49:11.1984680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1984727Z method(*args, 
**kwargs) 2025-12-04T10:49:11.1984893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1984937Z method(*args, **kwargs) 2025-12-04T10:49:11.1985104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1985146Z with policy(): 2025-12-04T10:49:11.1985335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1985382Z raise RuntimeError(msg) 2025-12-04T10:49:11.1985827Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.1985831Z 2025-12-04T10:49:11.1985914Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1986234Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1986239Z 2025-12-04T10:49:11.1986334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1986417Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1986478Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1986677Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1986769Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.1986828Z graph_break [] 2025-12-04T10:49:11.1986907Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1986969Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1987046Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1987240Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1987280Z graph_break [] 2025-12-04T10:49:11.1987339Z =================================== FAILURES =================================== 2025-12-04T10:49:11.1987505Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1987558Z Traceback (most recent call last): 2025-12-04T10:49:11.1987728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1987776Z method(*args, **kwargs) 2025-12-04T10:49:11.1987944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1987991Z method(*args, **kwargs) 2025-12-04T10:49:11.1988159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.1988204Z with policy(): 2025-12-04T10:49:11.1988371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.1988419Z raise RuntimeError(msg) 2025-12-04T10:49:11.1988884Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1988889Z 2025-12-04T10:49:11.1988969Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1989291Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1989294Z 2025-12-04T10:49:11.1989400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1989482Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1989542Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1989738Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1989818Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1989861Z graph_break [] 2025-12-04T10:49:11.1989941Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1990004Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1990081Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1990277Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1990319Z graph_break [] 2025-12-04T10:49:11.1990398Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.1990459Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.1990553Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.1990745Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.1990805Z graph_break [] 2025-12-04T10:49:11.1991074Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-f4be804bb213f75e.xml - 2025-12-04T10:49:11.1991139Z =========================== short test summary info ============================ 2025-12-04T10:49:11.1991842Z FAILED [0.7154s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.1991897Z 2025-12-04T10:49:11.1991979Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.1992299Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1992301Z 2025-12-04T10:49:11.1992397Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.1992466Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.1992541Z ================== 1 failed, 50 deselected, 2 rerun in 4.16s =================== 2025-12-04T10:49:11.1992581Z Got exit code 1 2025-12-04T10:49:11.1992629Z Retrying single test... 
2025-12-04T10:49:11.1992877Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ac9366a191378ad6.xml 2025-12-04T10:49:11.1992941Z ============================= test session starts ============================== 2025-12-04T10:49:11.1993066Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.1993114Z cachedir: .pytest_cache 2025-12-04T10:49:11.1993288Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.1993341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.1993406Z configfile: pytest.ini 2025-12-04T10:49:11.1993588Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.1993668Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.1993986Z stepcurrent: skipping 50 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.1994037Z Running 1 items in this shard 2025-12-04T10:49:11.1994040Z 2025-12-04T10:49:11.1994439Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:00.980470261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1994442Z 2025-12-04T10:49:11.1994615Z [W1204 10:43:08.680969188 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1994617Z 2025-12-04T10:49:11.1994784Z [W1204 10:43:08.681128197 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1994810Z 2025-12-04T10:49:11.1994977Z [W1204 10:43:08.684639795 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1994994Z 2025-12-04T10:49:11.1995159Z [W1204 10:43:08.684924821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1995161Z 2025-12-04T10:49:11.1995325Z [W1204 10:43:08.685006820 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1995327Z 2025-12-04T10:49:11.1995492Z [W1204 10:43:08.687378162 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1995494Z 2025-12-04T10:49:11.1995655Z [W1204 10:43:08.687657969 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1995660Z 2025-12-04T10:49:11.1995822Z [W1204 10:43:08.687733998 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1995825Z 2025-12-04T10:49:11.1995880Z ('RERUN', {'yellow': True}) [10.2136s] [100%] 2025-12-04T10:49:11.1996273Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:09.634764032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1996276Z 2025-12-04T10:49:11.1996442Z [W1204 10:43:09.635180947 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1996444Z 2025-12-04T10:49:11.1996617Z [W1204 10:43:09.635267676 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1996621Z 2025-12-04T10:49:11.1996786Z [W1204 10:43:09.636653429 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1996789Z 2025-12-04T10:49:11.1996950Z [W1204 10:43:09.636987445 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1996954Z 2025-12-04T10:49:11.1997115Z [W1204 10:43:09.637072844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1997117Z 2025-12-04T10:49:11.1997295Z [W1204 10:43:09.639241878 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1997297Z 2025-12-04T10:49:11.1997459Z [W1204 10:43:09.639509135 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1997462Z 2025-12-04T10:49:11.1997626Z [W1204 10:43:09.639585104 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1997628Z 2025-12-04T10:49:11.1997683Z ('RERUN', {'yellow': True}) [0.4569s] [100%] 2025-12-04T10:49:11.1998075Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:09.109132969 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998077Z 2025-12-04T10:49:11.1998242Z [W1204 10:43:09.109520975 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998244Z 2025-12-04T10:49:11.1998406Z [W1204 10:43:09.109607034 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998420Z 2025-12-04T10:49:11.1998583Z [W1204 10:43:09.110991357 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998585Z 2025-12-04T10:49:11.1998761Z [W1204 10:43:09.111325783 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998764Z 2025-12-04T10:49:11.1998929Z [W1204 10:43:09.111404672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1998931Z 2025-12-04T10:49:11.1999095Z [W1204 10:43:09.113595726 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1999097Z 2025-12-04T10:49:11.1999258Z [W1204 10:43:09.113852653 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.1999260Z 2025-12-04T10:49:11.1999427Z [W1204 10:43:09.113927802 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.1999429Z 2025-12-04T10:49:11.1999470Z FAILED [0.4732s] [100%] 2025-12-04T10:49:11.1999473Z 2025-12-04T10:49:11.1999532Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.1999696Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.1999748Z Traceback (most recent call last): 2025-12-04T10:49:11.1999921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.1999969Z method(*args, **kwargs) 2025-12-04T10:49:11.2000136Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2000182Z method(*args, **kwargs) 2025-12-04T10:49:11.2000361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2000403Z with policy(): 2025-12-04T10:49:11.2000573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2000618Z raise RuntimeError(msg) 2025-12-04T10:49:11.2001067Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.2001070Z 2025-12-04T10:49:11.2001151Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2001471Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2001474Z 2025-12-04T10:49:11.2001567Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2001649Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2001710Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2001953Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2002034Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2002077Z graph_break [] 2025-12-04T10:49:11.2002157Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2002535Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2002612Z if out == self.unknown_value: 2025-12-04T10:49:11.2002798Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2002849Z Traceback (most recent call last): 2025-12-04T10:49:11.2003017Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2003062Z method(*args, **kwargs) 2025-12-04T10:49:11.2003228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2003272Z method(*args, **kwargs) 2025-12-04T10:49:11.2003436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2003479Z with policy(): 2025-12-04T10:49:11.2003645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2003691Z raise RuntimeError(msg) 2025-12-04T10:49:11.2004136Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2004140Z 2025-12-04T10:49:11.2004220Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2004538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2004555Z 2025-12-04T10:49:11.2004650Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2004732Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2004793Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2004985Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2005063Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2005110Z graph_break [] 2025-12-04T10:49:11.2005203Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2005581Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2005631Z if out == self.unknown_value: 2025-12-04T10:49:11.2005713Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2005774Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2005858Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2006050Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2006094Z graph_break [] 2025-12-04T10:49:11.2006157Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2006324Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2006377Z Traceback (most recent call last): 2025-12-04T10:49:11.2006547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2006614Z method(*args, **kwargs) 2025-12-04T10:49:11.2006780Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2006840Z method(*args, **kwargs) 2025-12-04T10:49:11.2007005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2007050Z with policy(): 2025-12-04T10:49:11.2007216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2007266Z raise RuntimeError(msg) 2025-12-04T10:49:11.2007709Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2007712Z 2025-12-04T10:49:11.2007796Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2008114Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2008119Z 2025-12-04T10:49:11.2008213Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2008296Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2008356Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2008551Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2008641Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2008686Z graph_break [] 2025-12-04T10:49:11.2008764Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2009145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2009193Z if out == self.unknown_value: 2025-12-04T10:49:11.2009275Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2009350Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2009432Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2009624Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2009669Z graph_break [] 2025-12-04T10:49:11.2009747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2009810Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2009889Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2010081Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2010121Z graph_break [] 2025-12-04T10:49:11.2010390Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ac9366a191378ad6.xml - 2025-12-04T10:49:11.2010458Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2011155Z FAILED [0.4732s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2011181Z 2025-12-04T10:49:11.2011264Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2011580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2011585Z 2025-12-04T10:49:11.2011678Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2011750Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2011825Z ================== 1 failed, 57 deselected, 2 rerun in 11.30s ================== 2025-12-04T10:49:11.2011914Z Got exit code 1 2025-12-04T10:49:11.2011961Z Retrying single test... 2025-12-04T10:49:11.2012180Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d83e981c227c7f28.xml 2025-12-04T10:49:11.2012244Z ============================= test session starts ============================== 2025-12-04T10:49:11.2012369Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2012416Z cachedir: .pytest_cache 2025-12-04T10:49:11.2012593Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2012644Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2012692Z configfile: pytest.ini 2025-12-04T10:49:11.2012893Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2012977Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2013289Z stepcurrent: skipping 50 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2013341Z Running 1 items in this shard 2025-12-04T10:49:11.2013343Z 2025-12-04T10:49:11.2013754Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:18.825023419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2013760Z 2025-12-04T10:49:11.2013930Z [W1204 10:43:25.174833900 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2013933Z 2025-12-04T10:49:11.2014103Z [W1204 10:43:25.174994858 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2014106Z 2025-12-04T10:49:11.2014270Z [W1204 10:43:25.178663104 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2014272Z 2025-12-04T10:49:11.2014439Z [W1204 10:43:25.178952540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2014442Z 2025-12-04T10:49:11.2014605Z [W1204 10:43:25.179038729 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2014611Z 2025-12-04T10:49:11.2014773Z [W1204 10:43:25.181376991 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2014797Z 2025-12-04T10:49:11.2014963Z [W1204 10:43:25.181635528 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2014979Z 2025-12-04T10:49:11.2015141Z [W1204 10:43:25.181710817 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2015143Z 2025-12-04T10:49:11.2015201Z ('RERUN', {'yellow': True}) [9.8568s] [100%] 2025-12-04T10:49:11.2015593Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:26.119263660 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2015597Z 2025-12-04T10:49:11.2015763Z [W1204 10:43:26.119640716 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2015766Z 2025-12-04T10:49:11.2015933Z [W1204 10:43:26.119721125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2015936Z 2025-12-04T10:49:11.2016100Z [W1204 10:43:26.121104558 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2016102Z 2025-12-04T10:49:11.2016267Z [W1204 10:43:26.121433844 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2016271Z 2025-12-04T10:49:11.2016433Z [W1204 10:43:26.121511323 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2016435Z 2025-12-04T10:49:11.2016614Z [W1204 10:43:26.123676617 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2016617Z 2025-12-04T10:49:11.2016785Z [W1204 10:43:26.123937824 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2016788Z 2025-12-04T10:49:11.2016950Z [W1204 10:43:26.124018023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2016952Z 2025-12-04T10:49:11.2017010Z ('RERUN', {'yellow': True}) [0.4526s] [100%] 2025-12-04T10:49:11.2017427Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 [W1204 10:43:27.557673598 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2017429Z 2025-12-04T10:49:11.2017597Z [W1204 10:43:27.558042553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2017600Z 2025-12-04T10:49:11.2017765Z [W1204 10:43:27.558125472 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2017769Z 2025-12-04T10:49:11.2017931Z [W1204 10:43:27.559487876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2017933Z 2025-12-04T10:49:11.2018098Z [W1204 10:43:27.559801062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2018100Z 2025-12-04T10:49:11.2018266Z [W1204 10:43:27.559878731 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2018268Z 2025-12-04T10:49:11.2018436Z [W1204 10:43:27.562068944 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2018452Z 2025-12-04T10:49:11.2018615Z [W1204 10:43:27.562321091 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2018637Z 2025-12-04T10:49:11.2018799Z [W1204 10:43:27.562395270 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2018801Z 2025-12-04T10:49:11.2018848Z FAILED [0.4283s] [100%] 2025-12-04T10:49:11.2018850Z 2025-12-04T10:49:11.2018907Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2019076Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2019128Z Traceback (most recent call last): 2025-12-04T10:49:11.2019303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2019350Z method(*args, **kwargs) 2025-12-04T10:49:11.2019521Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2019566Z method(*args, **kwargs) 2025-12-04T10:49:11.2019736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2019777Z with policy(): 2025-12-04T10:49:11.2019948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2019993Z raise RuntimeError(msg) 2025-12-04T10:49:11.2020445Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 65536 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2020449Z 2025-12-04T10:49:11.2020533Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2020852Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2020856Z 2025-12-04T10:49:11.2020954Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2021035Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2021126Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2021320Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2021404Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2021447Z graph_break [] 2025-12-04T10:49:11.2021530Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2021950Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2022006Z if out == self.unknown_value: 2025-12-04T10:49:11.2022170Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2022225Z Traceback (most recent call last): 2025-12-04T10:49:11.2022396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2022441Z method(*args, **kwargs) 2025-12-04T10:49:11.2022611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2022671Z method(*args, **kwargs) 2025-12-04T10:49:11.2022839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2022898Z with policy(): 2025-12-04T10:49:11.2023068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2023114Z raise RuntimeError(msg) 2025-12-04T10:49:11.2023565Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 65536 and is now reported as 131072 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2023567Z 2025-12-04T10:49:11.2023647Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2023967Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2023970Z 2025-12-04T10:49:11.2024065Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2024148Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2024213Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2024407Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2024489Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2024532Z graph_break [] 2025-12-04T10:49:11.2024627Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2025004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2025057Z if out == self.unknown_value: 2025-12-04T10:49:11.2025136Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2025201Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2025280Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2025492Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2025535Z graph_break [] 2025-12-04T10:49:11.2025596Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2025764Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2025818Z Traceback (most recent call last): 2025-12-04T10:49:11.2025986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2026037Z method(*args, **kwargs) 2025-12-04T10:49:11.2026203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2026249Z method(*args, **kwargs) 2025-12-04T10:49:11.2026417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2026462Z with policy(): 2025-12-04T10:49:11.2026632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2026680Z raise RuntimeError(msg) 2025-12-04T10:49:11.2027136Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2027155Z 2025-12-04T10:49:11.2027235Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2027555Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2027558Z 2025-12-04T10:49:11.2027652Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2027734Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2027797Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2027993Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2028073Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2028117Z graph_break [] 2025-12-04T10:49:11.2028199Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2028577Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2028630Z if out == self.unknown_value: 2025-12-04T10:49:11.2028709Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2028782Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2028862Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2029058Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2029101Z graph_break [] 2025-12-04T10:49:11.2029182Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2029242Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2029324Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2029530Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2029575Z graph_break [] 2025-12-04T10:49:11.2029842Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-d83e981c227c7f28.xml - 2025-12-04T10:49:11.2029913Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2030610Z FAILED [0.4283s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 196608 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2030614Z 2025-12-04T10:49:11.2030693Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2031015Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2031028Z 2025-12-04T10:49:11.2031122Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2031205Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2031279Z ================== 1 failed, 57 deselected, 2 rerun in 10.90s ================== 2025-12-04T10:49:11.2031324Z Got exit code 1 2025-12-04T10:49:11.2031587Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2031732Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2032015Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b5900ef3d5a6a804.xml 2025-12-04T10:49:11.2032081Z ============================= test session starts ============================== 2025-12-04T10:49:11.2032208Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2032256Z cachedir: .pytest_cache 2025-12-04T10:49:11.2032434Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2032485Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2032534Z configfile: pytest.ini 2025-12-04T10:49:11.2032712Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2032795Z collecting ... collected 58 items / 51 deselected / 7 selected 2025-12-04T10:49:11.2032854Z stepcurrent: skipping 51 already run items. 2025-12-04T10:49:11.2032907Z Running 7 items in this shard 2025-12-04T10:49:11.2032909Z 2025-12-04T10:49:11.2033208Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.9921s] [ 14%] 2025-12-04T10:49:11.2033483Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6534s] [ 14%] 2025-12-04T10:49:11.2033726Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 FAILED [0.6466s] [ 14%] 2025-12-04T10:49:11.2033732Z 2025-12-04T10:49:11.2033805Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2033973Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2034023Z Traceback (most recent call last): 2025-12-04T10:49:11.2034203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2034249Z method(*args, **kwargs) 2025-12-04T10:49:11.2034420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2034465Z method(*args, **kwargs) 2025-12-04T10:49:11.2034635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2034677Z with policy(): 2025-12-04T10:49:11.2034848Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2034894Z raise RuntimeError(msg) 2025-12-04T10:49:11.2035334Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2035354Z 2025-12-04T10:49:11.2035435Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2035772Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2035775Z 2025-12-04T10:49:11.2035872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2035952Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2036017Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2036319Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2036404Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2036446Z graph_break [] 2025-12-04T10:49:11.2036613Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2036663Z Traceback (most recent call last): 2025-12-04T10:49:11.2036835Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2036879Z method(*args, **kwargs) 2025-12-04T10:49:11.2037048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2037093Z method(*args, **kwargs) 2025-12-04T10:49:11.2037279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2037322Z with policy(): 2025-12-04T10:49:11.2037492Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2037538Z raise RuntimeError(msg) 2025-12-04T10:49:11.2037986Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2037989Z 2025-12-04T10:49:11.2038085Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2038408Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2038412Z 2025-12-04T10:49:11.2038509Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2038588Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2038654Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2038954Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2039037Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2039078Z graph_break [] 2025-12-04T10:49:11.2039161Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2039221Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2039303Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2039616Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2039670Z graph_break [] 2025-12-04T10:49:11.2039732Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2039897Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.2039951Z Traceback (most recent call last): 2025-12-04T10:49:11.2040122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2040171Z method(*args, **kwargs) 2025-12-04T10:49:11.2040337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2040386Z method(*args, **kwargs) 2025-12-04T10:49:11.2040554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2040600Z with policy(): 2025-12-04T10:49:11.2040768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2040817Z raise RuntimeError(msg) 2025-12-04T10:49:11.2041261Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.2041263Z 2025-12-04T10:49:11.2041346Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2041675Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2041679Z 2025-12-04T10:49:11.2041774Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2041928Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2041990Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2042310Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2042391Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2042435Z graph_break [] 2025-12-04T10:49:11.2042514Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2042578Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2042656Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2042958Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2042999Z graph_break [] 2025-12-04T10:49:11.2043081Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2043142Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2043223Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2043520Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2043577Z graph_break [] 2025-12-04T10:49:11.2043847Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-b5900ef3d5a6a804.xml - 2025-12-04T10:49:11.2043934Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2044634Z FAILED [0.6466s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2044638Z 2025-12-04T10:49:11.2044720Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2045044Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2045047Z 2025-12-04T10:49:11.2045144Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2045213Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.2045289Z ================== 1 failed, 51 deselected, 2 rerun in 4.45s =================== 2025-12-04T10:49:11.2045331Z Got exit code 1 2025-12-04T10:49:11.2045380Z Retrying single test... 2025-12-04T10:49:11.2045601Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ea5bddd707d4884e.xml 2025-12-04T10:49:11.2045682Z ============================= test session starts ============================== 2025-12-04T10:49:11.2045807Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2045856Z cachedir: .pytest_cache 2025-12-04T10:49:11.2046032Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2046086Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2046131Z configfile: pytest.ini 2025-12-04T10:49:11.2046317Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2046411Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2046727Z stepcurrent: skipping 51 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2046777Z Running 1 items in this shard 2025-12-04T10:49:11.2046779Z 2025-12-04T10:49:11.2047177Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:43:47.034436354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2047181Z 2025-12-04T10:49:11.2047353Z [W1204 10:43:54.372164023 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2047355Z 2025-12-04T10:49:11.2047522Z [W1204 10:43:54.372314551 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2047524Z 2025-12-04T10:49:11.2047691Z [W1204 10:43:54.375962076 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2047706Z 2025-12-04T10:49:11.2047871Z [W1204 10:43:54.376278422 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2047887Z 2025-12-04T10:49:11.2048056Z [W1204 10:43:54.376360831 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2048058Z 2025-12-04T10:49:11.2048224Z [W1204 10:43:54.378796331 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2048226Z 2025-12-04T10:49:11.2048389Z [W1204 10:43:54.379064868 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2048391Z 2025-12-04T10:49:11.2048556Z [W1204 10:43:54.379144497 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2048560Z 2025-12-04T10:49:11.2048617Z ('RERUN', {'yellow': True}) [10.2342s] [100%] 2025-12-04T10:49:11.2049010Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:43:55.007257248 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049014Z 2025-12-04T10:49:11.2049180Z [W1204 10:43:55.007605564 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049183Z 2025-12-04T10:49:11.2049346Z [W1204 10:43:55.007685803 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049348Z 2025-12-04T10:49:11.2049515Z [W1204 10:43:55.009070726 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049539Z 2025-12-04T10:49:11.2049702Z [W1204 10:43:55.009321323 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049705Z 2025-12-04T10:49:11.2049871Z [W1204 10:43:55.009397122 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2049873Z 2025-12-04T10:49:11.2050041Z [W1204 10:43:55.011580735 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2050043Z 2025-12-04T10:49:11.2050227Z [W1204 10:43:55.011836482 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2050230Z 2025-12-04T10:49:11.2050397Z [W1204 10:43:55.011910511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2050400Z 2025-12-04T10:49:11.2050460Z ('RERUN', {'yellow': True}) [0.4766s] [100%] 2025-12-04T10:49:11.2050858Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:43:55.468776087 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2050861Z 2025-12-04T10:49:11.2051025Z [W1204 10:43:55.469162372 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2051031Z 2025-12-04T10:49:11.2051193Z [W1204 10:43:55.469252251 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2051195Z 2025-12-04T10:49:11.2051361Z [W1204 10:43:55.470623634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2051378Z 2025-12-04T10:49:11.2051543Z [W1204 10:43:55.470879511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2051545Z 2025-12-04T10:49:11.2051710Z [W1204 10:43:55.470956270 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2051736Z 2025-12-04T10:49:11.2051951Z [W1204 10:43:55.473134573 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2051954Z 2025-12-04T10:49:11.2052121Z [W1204 10:43:55.473393550 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2052123Z 2025-12-04T10:49:11.2052289Z [W1204 10:43:55.473467659 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2052291Z 2025-12-04T10:49:11.2052337Z FAILED [0.4558s] [100%] 2025-12-04T10:49:11.2052339Z 2025-12-04T10:49:11.2052401Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2052567Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2052622Z Traceback (most recent call last): 2025-12-04T10:49:11.2052796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2052845Z method(*args, **kwargs) 2025-12-04T10:49:11.2053013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2055731Z method(*args, **kwargs) 2025-12-04T10:49:11.2055910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2055953Z with policy(): 2025-12-04T10:49:11.2056164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2056210Z raise RuntimeError(msg) 2025-12-04T10:49:11.2056651Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2056657Z 2025-12-04T10:49:11.2056739Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2057077Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2057079Z 2025-12-04T10:49:11.2057178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2057261Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2057325Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2057626Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2057708Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2057748Z graph_break [] 2025-12-04T10:49:11.2057832Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2058215Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2058279Z if out == self.unknown_value: 2025-12-04T10:49:11.2058445Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2058515Z Traceback (most recent call last): 2025-12-04T10:49:11.2058685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2058729Z method(*args, **kwargs) 2025-12-04T10:49:11.2058897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2058943Z method(*args, **kwargs) 2025-12-04T10:49:11.2059109Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2059150Z with policy(): 2025-12-04T10:49:11.2059320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2059366Z raise RuntimeError(msg) 2025-12-04T10:49:11.2059811Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2059816Z 2025-12-04T10:49:11.2059895Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2060216Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2060218Z 2025-12-04T10:49:11.2060314Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2060404Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2060467Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2060767Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2060849Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2060889Z graph_break [] 2025-12-04T10:49:11.2060968Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2061357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2061408Z if out == self.unknown_value: 2025-12-04T10:49:11.2061486Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2061549Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2061627Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2061983Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2062023Z graph_break [] 2025-12-04T10:49:11.2062082Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2062244Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2062296Z Traceback (most recent call last): 2025-12-04T10:49:11.2062467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2062537Z method(*args, **kwargs) 2025-12-04T10:49:11.2062703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2062771Z method(*args, **kwargs) 2025-12-04T10:49:11.2062938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2062978Z with policy(): 2025-12-04T10:49:11.2063148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2063192Z raise RuntimeError(msg) 2025-12-04T10:49:11.2063643Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2063648Z 2025-12-04T10:49:11.2063728Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2064046Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2064048Z 2025-12-04T10:49:11.2064143Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2064225Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2064287Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2064595Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2064677Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2064716Z graph_break [] 2025-12-04T10:49:11.2064797Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2065170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2065218Z if out == self.unknown_value: 2025-12-04T10:49:11.2065310Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2065371Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2065449Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2065746Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2065787Z graph_break [] 2025-12-04T10:49:11.2065866Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2065924Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2066003Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2066297Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2066337Z graph_break [] 2025-12-04T10:49:11.2066609Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ea5bddd707d4884e.xml - 2025-12-04T10:49:11.2066687Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2067382Z FAILED [0.4558s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2067402Z 2025-12-04T10:49:11.2067481Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2067796Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2067799Z 2025-12-04T10:49:11.2067894Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2067961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2068038Z ================== 1 failed, 57 deselected, 2 rerun in 11.32s ================== 2025-12-04T10:49:11.2068078Z Got exit code 1 2025-12-04T10:49:11.2068123Z Retrying single test... 2025-12-04T10:49:11.2068341Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ad779801dbf0137d.xml 2025-12-04T10:49:11.2068406Z ============================= test session starts ============================== 2025-12-04T10:49:11.2068530Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2068576Z cachedir: .pytest_cache 2025-12-04T10:49:11.2068766Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2068819Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2068864Z configfile: pytest.ini 2025-12-04T10:49:11.2069046Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2069127Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2069456Z stepcurrent: skipping 51 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2069506Z Running 1 items in this shard 2025-12-04T10:49:11.2069509Z 2025-12-04T10:49:11.2069908Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:44:05.906954079 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2069913Z 2025-12-04T10:49:11.2070083Z [W1204 10:44:12.646842822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2070086Z 2025-12-04T10:49:11.2070250Z [W1204 10:44:12.647005490 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2070252Z 2025-12-04T10:49:11.2070418Z [W1204 10:44:12.650929241 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2070420Z 2025-12-04T10:49:11.2070581Z [W1204 10:44:12.651240727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2070586Z 2025-12-04T10:49:11.2070749Z [W1204 10:44:12.651318186 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2070768Z 2025-12-04T10:49:11.2070933Z [W1204 10:44:12.653825405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2070952Z 2025-12-04T10:49:11.2071113Z [W1204 10:44:12.654116261 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2071115Z 2025-12-04T10:49:11.2071278Z [W1204 10:44:12.654193540 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2071281Z 2025-12-04T10:49:11.2071336Z ('RERUN', {'yellow': True}) [9.7987s] [100%] 2025-12-04T10:49:11.2071730Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:44:12.400699419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2071733Z 2025-12-04T10:49:11.2071953Z [W1204 10:44:12.401110584 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2071956Z 2025-12-04T10:49:11.2072119Z [W1204 10:44:12.401197393 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2072121Z 2025-12-04T10:49:11.2072285Z [W1204 10:44:12.402582566 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2072287Z 2025-12-04T10:49:11.2072449Z [W1204 10:44:12.402833003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2072451Z 2025-12-04T10:49:11.2072637Z [W1204 10:44:12.402906812 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2072640Z 2025-12-04T10:49:11.2072804Z [W1204 10:44:12.405088005 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2072807Z 2025-12-04T10:49:11.2072969Z [W1204 10:44:12.405366721 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2072971Z 2025-12-04T10:49:11.2073137Z [W1204 10:44:12.405440480 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2073155Z 2025-12-04T10:49:11.2073209Z ('RERUN', {'yellow': True}) [0.6043s] [100%] 2025-12-04T10:49:11.2073601Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 [W1204 10:44:13.997995360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2073604Z 2025-12-04T10:49:11.2073765Z [W1204 10:44:13.998444134 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2073770Z 2025-12-04T10:49:11.2073931Z [W1204 10:44:13.998537503 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2073933Z 2025-12-04T10:49:11.2074095Z [W1204 10:44:13.999938876 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2074098Z 2025-12-04T10:49:11.2074259Z [W1204 10:44:13.000214522 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2074261Z 2025-12-04T10:49:11.2074423Z [W1204 10:44:13.000292821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2074441Z 2025-12-04T10:49:11.2074602Z [W1204 10:44:13.002492634 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2074623Z 2025-12-04T10:49:11.2074783Z [W1204 10:44:13.002750781 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2074785Z 2025-12-04T10:49:11.2074948Z [W1204 10:44:13.002825530 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2074950Z 2025-12-04T10:49:11.2074993Z FAILED [0.6277s] [100%] 2025-12-04T10:49:11.2074995Z 2025-12-04T10:49:11.2075054Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2075220Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2075273Z Traceback (most recent call last): 2025-12-04T10:49:11.2075448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2075495Z method(*args, **kwargs) 2025-12-04T10:49:11.2075660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2075705Z method(*args, **kwargs) 2025-12-04T10:49:11.2075872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2077926Z with policy(): 2025-12-04T10:49:11.2078096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.2078144Z raise RuntimeError(msg) 2025-12-04T10:49:11.2078595Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 66560 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2078602Z 2025-12-04T10:49:11.2078683Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2079001Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2079005Z 2025-12-04T10:49:11.2079146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2079229Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2079290Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2079592Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2079673Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2079715Z graph_break [] 2025-12-04T10:49:11.2079793Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2080178Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2080227Z if out == self.unknown_value: 2025-12-04T10:49:11.2080392Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2080444Z Traceback (most recent call last): 2025-12-04T10:49:11.2080615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2080659Z method(*args, **kwargs) 2025-12-04T10:49:11.2080839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2080882Z method(*args, **kwargs) 2025-12-04T10:49:11.2081048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2081090Z with policy(): 2025-12-04T10:49:11.2081258Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2081305Z raise RuntimeError(msg) 2025-12-04T10:49:11.2081750Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 66560 and is now reported as 133120 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2081753Z 2025-12-04T10:49:11.2081837Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2082199Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2082202Z 2025-12-04T10:49:11.2082299Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2082460Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2082523Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2082841Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2082924Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2082967Z graph_break [] 2025-12-04T10:49:11.2083044Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2083419Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2083483Z if out == self.unknown_value: 2025-12-04T10:49:11.2083563Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2083621Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2083701Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2083997Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2084039Z graph_break [] 2025-12-04T10:49:11.2084096Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2084264Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2084313Z Traceback (most recent call last): 2025-12-04T10:49:11.2084483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2084528Z method(*args, **kwargs) 2025-12-04T10:49:11.2084696Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2084740Z method(*args, **kwargs) 2025-12-04T10:49:11.2084907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2084968Z with policy(): 2025-12-04T10:49:11.2085137Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2085182Z raise RuntimeError(msg) 2025-12-04T10:49:11.2085626Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2085630Z 2025-12-04T10:49:11.2085710Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2086022Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2086025Z 2025-12-04T10:49:11.2086121Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2086198Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2086259Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2086554Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2086654Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2086695Z graph_break [] 2025-12-04T10:49:11.2086789Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2087162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2087209Z if out == self.unknown_value: 2025-12-04T10:49:11.2087290Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2087349Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2087429Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2087735Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2087778Z graph_break [] 2025-12-04T10:49:11.2087856Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2087916Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2087994Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2088289Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2088329Z graph_break [] 2025-12-04T10:49:11.2088595Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ad779801dbf0137d.xml - 2025-12-04T10:49:11.2088662Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2089358Z FAILED [0.6277s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 133120 and is now reported as 199680 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2089374Z 2025-12-04T10:49:11.2089456Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2089770Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2089773Z 2025-12-04T10:49:11.2089868Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2089936Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2090010Z ================== 1 failed, 57 deselected, 2 rerun in 11.17s ================== 2025-12-04T10:49:11.2090050Z Got exit code 1 2025-12-04T10:49:11.2090313Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2090455Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2090670Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21f50debd020f240.xml 2025-12-04T10:49:11.2090749Z ============================= test session starts ============================== 2025-12-04T10:49:11.2090872Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2090919Z cachedir: .pytest_cache 2025-12-04T10:49:11.2091101Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2091154Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.2091198Z configfile: pytest.ini 2025-12-04T10:49:11.2091378Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2091458Z collecting ... collected 58 items / 52 deselected / 6 selected 2025-12-04T10:49:11.2091516Z stepcurrent: skipping 52 already run items. 2025-12-04T10:49:11.2091564Z Running 6 items in this shard 2025-12-04T10:49:11.2091566Z 2025-12-04T10:49:11.2091919Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7000s] [ 16%] 2025-12-04T10:49:11.2092189Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.7693s] [ 16%] 2025-12-04T10:49:11.2092438Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 FAILED [0.7691s] [ 16%] 2025-12-04T10:49:11.2092441Z 2025-12-04T10:49:11.2092499Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2092663Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2092715Z Traceback (most recent call last): 2025-12-04T10:49:11.2092888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2092934Z method(*args, **kwargs) 2025-12-04T10:49:11.2093101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2093145Z method(*args, **kwargs) 2025-12-04T10:49:11.2093309Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2093370Z with policy(): 2025-12-04T10:49:11.2093536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2093582Z raise RuntimeError(msg) 2025-12-04T10:49:11.2094017Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2094023Z 2025-12-04T10:49:11.2094102Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2094420Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2094424Z 2025-12-04T10:49:11.2094516Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2094598Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2094657Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2094853Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2094947Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2094989Z graph_break [] 2025-12-04T10:49:11.2095166Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2095218Z Traceback (most recent call last): 
2025-12-04T10:49:11.2095384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2095429Z method(*args, **kwargs) 2025-12-04T10:49:11.2095592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2095637Z method(*args, **kwargs) 2025-12-04T10:49:11.2095799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2095855Z with policy(): 2025-12-04T10:49:11.2096024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2096071Z raise RuntimeError(msg) 2025-12-04T10:49:11.2096515Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2096520Z 2025-12-04T10:49:11.2096599Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2096915Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2096917Z 2025-12-04T10:49:11.2097012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2097091Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2097151Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2097347Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2097425Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2097481Z graph_break [] 2025-12-04T10:49:11.2097559Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2097619Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2097695Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2097889Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2097931Z graph_break [] 2025-12-04T10:49:11.2097989Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2098157Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2098206Z Traceback (most recent call last): 2025-12-04T10:49:11.2098376Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2098420Z method(*args, **kwargs) 2025-12-04T10:49:11.2098585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2098629Z method(*args, **kwargs) 2025-12-04T10:49:11.2098793Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2098847Z with policy(): 2025-12-04T10:49:11.2099012Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2099056Z raise RuntimeError(msg) 2025-12-04T10:49:11.2099510Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2099513Z 2025-12-04T10:49:11.2099592Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2099914Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2099930Z 2025-12-04T10:49:11.2100024Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2100102Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2100164Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2100353Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2100432Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2100473Z graph_break [] 2025-12-04T10:49:11.2100551Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2100609Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2100686Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2100875Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2100917Z graph_break [] 2025-12-04T10:49:11.2100993Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2101054Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2101129Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2101318Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2101372Z graph_break [] 2025-12-04T10:49:11.2101638Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-21f50debd020f240.xml - 2025-12-04T10:49:11.2101704Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2102448Z FAILED [0.7691s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2102452Z 2025-12-04T10:49:11.2102534Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2102846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2102848Z 2025-12-04T10:49:11.2102943Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2103033Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2103106Z ================== 1 failed, 52 deselected, 2 rerun in 4.40s =================== 2025-12-04T10:49:11.2103146Z Got exit code 1 2025-12-04T10:49:11.2103205Z Retrying single test... 
2025-12-04T10:49:11.2103421Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7c698df328fa8627.xml 2025-12-04T10:49:11.2103486Z ============================= test session starts ============================== 2025-12-04T10:49:11.2103610Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2103655Z cachedir: .pytest_cache 2025-12-04T10:49:11.2103829Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2103896Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2103944Z configfile: pytest.ini 2025-12-04T10:49:11.2104124Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2104205Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2104524Z stepcurrent: skipping 52 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2104572Z Running 1 items in this shard 2025-12-04T10:49:11.2104575Z 2025-12-04T10:49:11.2104973Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:44:34.744164462 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2104979Z 2025-12-04T10:49:11.2105146Z [W1204 10:44:41.380036898 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2105148Z 2025-12-04T10:49:11.2105313Z [W1204 10:44:41.380197846 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2105315Z 2025-12-04T10:49:11.2105477Z [W1204 10:44:41.384367923 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2105497Z 2025-12-04T10:49:11.2105658Z [W1204 10:44:41.384684499 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2105661Z 2025-12-04T10:49:11.2105824Z [W1204 10:44:41.384762348 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2105828Z 2025-12-04T10:49:11.2105989Z [W1204 10:44:41.387299135 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2105993Z 2025-12-04T10:49:11.2106156Z [W1204 10:44:41.387564652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2106158Z 2025-12-04T10:49:11.2106320Z [W1204 10:44:41.387641821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2106324Z 2025-12-04T10:49:11.2106380Z ('RERUN', {'yellow': True}) [10.2854s] [100%] 2025-12-04T10:49:11.2106775Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:44:43.598587150 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2106788Z 2025-12-04T10:49:11.2106952Z [W1204 10:44:43.599012825 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2106956Z 2025-12-04T10:49:11.2107131Z [W1204 10:44:43.599093974 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107133Z 2025-12-04T10:49:11.2107295Z [W1204 10:44:43.600465266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107299Z 2025-12-04T10:49:11.2107459Z [W1204 10:44:43.600785412 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107461Z 2025-12-04T10:49:11.2107623Z [W1204 10:44:43.600862311 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107625Z 2025-12-04T10:49:11.2107797Z [W1204 10:44:43.603192042 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107799Z 2025-12-04T10:49:11.2107964Z [W1204 10:44:43.603444948 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2107966Z 2025-12-04T10:49:11.2108128Z [W1204 10:44:43.603520967 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2108131Z 2025-12-04T10:49:11.2108184Z ('RERUN', {'yellow': True}) [0.6920s] [100%] 2025-12-04T10:49:11.2108576Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:44:43.260732897 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2108578Z 2025-12-04T10:49:11.2108741Z [W1204 10:44:43.261133672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2108744Z 2025-12-04T10:49:11.2108907Z [W1204 10:44:43.261215481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2108909Z 2025-12-04T10:49:11.2109073Z [W1204 10:44:43.262580914 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2109094Z 2025-12-04T10:49:11.2109255Z [W1204 10:44:43.262896160 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2109257Z 2025-12-04T10:49:11.2109419Z [W1204 10:44:43.262972939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2109421Z 2025-12-04T10:49:11.2109584Z [W1204 10:44:43.265222840 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2109587Z 2025-12-04T10:49:11.2109750Z [W1204 10:44:43.265476307 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2109752Z 2025-12-04T10:49:11.2109914Z [W1204 10:44:43.265551386 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2109917Z 2025-12-04T10:49:11.2109958Z FAILED [0.6483s] [100%] 2025-12-04T10:49:11.2109962Z 2025-12-04T10:49:11.2110019Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2110184Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2110235Z Traceback (most recent call last): 2025-12-04T10:49:11.2110406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2110467Z method(*args, **kwargs) 2025-12-04T10:49:11.2110633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2110688Z method(*args, **kwargs) 2025-12-04T10:49:11.2110853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2110895Z with policy(): 2025-12-04T10:49:11.2111062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2111108Z raise RuntimeError(msg) 2025-12-04T10:49:11.2111556Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.2111562Z 2025-12-04T10:49:11.2111642Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2111998Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2112002Z 2025-12-04T10:49:11.2112096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2112175Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2112235Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2112428Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2112509Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2112551Z graph_break [] 2025-12-04T10:49:11.2112628Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2113012Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2113083Z if out == self.unknown_value: 2025-12-04T10:49:11.2113249Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2113297Z Traceback (most recent call last): 2025-12-04T10:49:11.2113465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2113510Z method(*args, **kwargs) 2025-12-04T10:49:11.2113676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2113719Z method(*args, **kwargs) 2025-12-04T10:49:11.2113888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2113930Z with policy(): 2025-12-04T10:49:11.2114100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2114146Z raise RuntimeError(msg) 2025-12-04T10:49:11.2114592Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2114609Z 2025-12-04T10:49:11.2114690Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2115019Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2115021Z 2025-12-04T10:49:11.2115117Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2115194Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2115256Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2115446Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2115525Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2115565Z graph_break [] 2025-12-04T10:49:11.2115664Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2116045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2116093Z if out == self.unknown_value: 2025-12-04T10:49:11.2116172Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2116233Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2116311Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2116502Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2116543Z graph_break [] 2025-12-04T10:49:11.2116600Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2116765Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2116813Z Traceback (most recent call last): 2025-12-04T10:49:11.2116984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2117027Z method(*args, **kwargs) 2025-12-04T10:49:11.2117192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2117256Z method(*args, **kwargs) 2025-12-04T10:49:11.2117421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2117461Z with policy(): 2025-12-04T10:49:11.2117629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2117674Z raise RuntimeError(msg) 2025-12-04T10:49:11.2118122Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2118125Z 2025-12-04T10:49:11.2118206Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2118521Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2118523Z 2025-12-04T10:49:11.2118618Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2118709Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2118770Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2118973Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2119052Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2119093Z graph_break [] 2025-12-04T10:49:11.2119171Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2119548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2119596Z if out == self.unknown_value: 2025-12-04T10:49:11.2119690Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2119750Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2119828Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2120019Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2120060Z graph_break [] 2025-12-04T10:49:11.2120136Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2120197Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2120273Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2120462Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2120501Z graph_break [] 2025-12-04T10:49:11.2120770Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-7c698df328fa8627.xml - 2025-12-04T10:49:11.2120833Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2121535Z FAILED [0.6483s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2121551Z 2025-12-04T10:49:11.2121630Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2121991Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2121996Z 2025-12-04T10:49:11.2122090Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2122157Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2122230Z ================== 1 failed, 57 deselected, 2 rerun in 11.78s ================== 2025-12-04T10:49:11.2122271Z Got exit code 1 2025-12-04T10:49:11.2122315Z Retrying single test... 2025-12-04T10:49:11.2122528Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-768014ba8531cfb6.xml 2025-12-04T10:49:11.2122590Z ============================= test session starts ============================== 2025-12-04T10:49:11.2122715Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2122785Z cachedir: .pytest_cache 2025-12-04T10:49:11.2122957Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2123021Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2123065Z configfile: pytest.ini 2025-12-04T10:49:11.2123244Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2123325Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2123638Z stepcurrent: skipping 52 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2123688Z Running 1 items in this shard 2025-12-04T10:49:11.2123690Z 2025-12-04T10:49:11.2124102Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:44:53.740308062 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124106Z 2025-12-04T10:49:11.2124276Z [W1204 10:45:00.066219575 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124278Z 2025-12-04T10:49:11.2124442Z [W1204 10:45:00.066379803 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124447Z 2025-12-04T10:49:11.2124609Z [W1204 10:45:00.070093085 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124611Z 2025-12-04T10:49:11.2124774Z [W1204 10:45:00.070490740 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124777Z 2025-12-04T10:49:11.2124937Z [W1204 10:45:00.070571319 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2124939Z 2025-12-04T10:49:11.2125103Z [W1204 10:45:00.073136756 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2125105Z 2025-12-04T10:49:11.2125265Z [W1204 10:45:00.073426792 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2125284Z 2025-12-04T10:49:11.2125444Z [W1204 10:45:00.073504341 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2125446Z 2025-12-04T10:49:11.2125502Z ('RERUN', {'yellow': True}) [9.9837s] [100%] 2025-12-04T10:49:11.2125897Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:45:01.224114212 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2125900Z 2025-12-04T10:49:11.2126064Z [W1204 10:45:01.224573436 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2126065Z 2025-12-04T10:49:11.2126226Z [W1204 10:45:01.224654285 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2126229Z 2025-12-04T10:49:11.2126390Z [W1204 10:45:01.226060887 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2126392Z 2025-12-04T10:49:11.2126554Z [W1204 10:45:01.226387533 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2126567Z 2025-12-04T10:49:11.2126728Z [W1204 10:45:01.226467042 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2126731Z 2025-12-04T10:49:11.2126905Z [W1204 10:45:01.228692003 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2126907Z 2025-12-04T10:49:11.2127068Z [W1204 10:45:01.228955330 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2127071Z 2025-12-04T10:49:11.2127233Z [W1204 10:45:01.229036659 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2127235Z 2025-12-04T10:49:11.2127289Z ('RERUN', {'yellow': True}) [0.6700s] [100%] 2025-12-04T10:49:11.2127689Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 [W1204 10:45:02.922189760 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2127693Z 2025-12-04T10:49:11.2127856Z [W1204 10:45:02.922625405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2127858Z 2025-12-04T10:49:11.2128020Z [W1204 10:45:02.922716063 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2128023Z 2025-12-04T10:49:11.2128185Z [W1204 10:45:02.924174395 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2128187Z 2025-12-04T10:49:11.2128349Z [W1204 10:45:02.924507700 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2128354Z 2025-12-04T10:49:11.2128514Z [W1204 10:45:02.924585979 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2128516Z 2025-12-04T10:49:11.2128678Z [W1204 10:45:02.926832620 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2128680Z 2025-12-04T10:49:11.2128840Z [W1204 10:45:02.927101107 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2128853Z 2025-12-04T10:49:11.2129014Z [W1204 10:45:02.927179766 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2129016Z 2025-12-04T10:49:11.2129059Z FAILED [0.6918s] [100%] 2025-12-04T10:49:11.2129062Z 2025-12-04T10:49:11.2129120Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2129288Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2129339Z Traceback (most recent call last): 2025-12-04T10:49:11.2129513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2129557Z method(*args, **kwargs) 2025-12-04T10:49:11.2129724Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2129767Z method(*args, **kwargs) 2025-12-04T10:49:11.2129932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2129971Z with policy(): 2025-12-04T10:49:11.2130138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2130196Z raise RuntimeError(msg) 2025-12-04T10:49:11.2130646Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 131072 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2130650Z 2025-12-04T10:49:11.2130731Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2131052Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2131055Z 2025-12-04T10:49:11.2131150Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2131246Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2131309Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2131500Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2131581Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2131620Z graph_break [] 2025-12-04T10:49:11.2131699Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2132119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2132169Z if out == self.unknown_value: 2025-12-04T10:49:11.2132333Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2132384Z Traceback (most recent call last): 2025-12-04T10:49:11.2132552Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2132597Z method(*args, **kwargs) 2025-12-04T10:49:11.2132762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2132806Z method(*args, **kwargs) 2025-12-04T10:49:11.2132969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2133030Z with policy(): 2025-12-04T10:49:11.2133195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2133240Z raise RuntimeError(msg) 2025-12-04T10:49:11.2133688Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 131072 and is now reported as 262144 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2133691Z 2025-12-04T10:49:11.2133770Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2134085Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2134089Z 2025-12-04T10:49:11.2134181Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2134259Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2134318Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2134510Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2134603Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2134644Z graph_break [] 2025-12-04T10:49:11.2134735Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2135110Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2135161Z if out == self.unknown_value: 2025-12-04T10:49:11.2135237Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2135296Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2135386Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2135580Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2135619Z graph_break [] 2025-12-04T10:49:11.2135677Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2135841Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2135892Z Traceback (most recent call last): 2025-12-04T10:49:11.2136058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2136102Z method(*args, **kwargs) 2025-12-04T10:49:11.2136266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2136310Z method(*args, **kwargs) 2025-12-04T10:49:11.2136475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2136516Z with policy(): 2025-12-04T10:49:11.2136683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2136728Z raise RuntimeError(msg) 2025-12-04T10:49:11.2137173Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2137193Z 2025-12-04T10:49:11.2137272Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2137588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2137591Z 2025-12-04T10:49:11.2137683Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2137762Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2137821Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2138010Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2138088Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2138128Z graph_break [] 2025-12-04T10:49:11.2138205Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2138579Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2138641Z if out == self.unknown_value: 2025-12-04T10:49:11.2138732Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2138791Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2138869Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2139061Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2139101Z graph_break [] 2025-12-04T10:49:11.2139179Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2139236Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2139326Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2139515Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2139555Z graph_break [] 2025-12-04T10:49:11.2139817Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-768014ba8531cfb6.xml - 2025-12-04T10:49:11.2139882Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2140576Z FAILED [0.6918s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 262144 and is now reported as 393216 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2140581Z 2025-12-04T10:49:11.2140662Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2140977Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2140979Z 2025-12-04T10:49:11.2141084Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2141151Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2141222Z ================== 1 failed, 57 deselected, 2 rerun in 11.49s ================== 2025-12-04T10:49:11.2141263Z Got exit code 1 2025-12-04T10:49:11.2141526Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2141668Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2141927Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-91786c14bc5adc43.xml 2025-12-04T10:49:11.2141990Z ============================= test session starts ============================== 2025-12-04T10:49:11.2142112Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2142159Z cachedir: .pytest_cache 2025-12-04T10:49:11.2142331Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2142382Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2142427Z configfile: pytest.ini 2025-12-04T10:49:11.2142605Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2142701Z collecting ... collected 58 items / 53 deselected / 5 selected 2025-12-04T10:49:11.2142759Z stepcurrent: skipping 53 already run items. 2025-12-04T10:49:11.2142821Z Running 5 items in this shard 2025-12-04T10:49:11.2142823Z 2025-12-04T10:49:11.2143094Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.4332s] [ 20%] 2025-12-04T10:49:11.2143364Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4447s] [ 20%] 2025-12-04T10:49:11.2143618Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 FAILED [0.4368s] [ 20%] 2025-12-04T10:49:11.2143622Z 2025-12-04T10:49:11.2143680Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2143844Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2143894Z Traceback (most recent call last): 2025-12-04T10:49:11.2144065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2144112Z method(*args, **kwargs) 2025-12-04T10:49:11.2144280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2144324Z method(*args, **kwargs) 2025-12-04T10:49:11.2144489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2144530Z with policy(): 2025-12-04T10:49:11.2144699Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2144742Z raise RuntimeError(msg) 2025-12-04T10:49:11.2145172Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2145191Z 2025-12-04T10:49:11.2145269Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2145582Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2145584Z 2025-12-04T10:49:11.2145677Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2145756Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2145815Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2146007Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2146084Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2146126Z graph_break [] 2025-12-04T10:49:11.2146290Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2146338Z Traceback (most recent call last): 2025-12-04T10:49:11.2146506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2146549Z method(*args, **kwargs) 
2025-12-04T10:49:11.2146739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2146782Z method(*args, **kwargs) 2025-12-04T10:49:11.2146965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2147005Z with policy(): 2025-12-04T10:49:11.2147172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2147218Z raise RuntimeError(msg) 2025-12-04T10:49:11.2147654Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.2147656Z 2025-12-04T10:49:11.2147745Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2148059Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2148062Z 2025-12-04T10:49:11.2148155Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2148233Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2148294Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2148483Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2148562Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.2148601Z graph_break [] 2025-12-04T10:49:11.2148682Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2148740Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2148816Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2149006Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2149046Z graph_break [] 2025-12-04T10:49:11.2149102Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2149278Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2149326Z Traceback (most recent call last): 2025-12-04T10:49:11.2149494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2149538Z method(*args, **kwargs) 2025-12-04T10:49:11.2149703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2149745Z method(*args, **kwargs) 2025-12-04T10:49:11.2149913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2149953Z with policy(): 2025-12-04T10:49:11.2150119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2150164Z raise RuntimeError(msg) 2025-12-04T10:49:11.2150602Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2150617Z 2025-12-04T10:49:11.2150697Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2151019Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2151021Z 2025-12-04T10:49:11.2151117Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2151196Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2151258Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2151448Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2151527Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2151578Z graph_break [] 2025-12-04T10:49:11.2151659Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2151716Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2151794Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2152022Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2152065Z graph_break [] 2025-12-04T10:49:11.2152144Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2152202Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2152279Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2152468Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2152510Z graph_break [] 2025-12-04T10:49:11.2152774Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-91786c14bc5adc43.xml - 2025-12-04T10:49:11.2152840Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2153521Z FAILED [0.4368s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2153543Z 2025-12-04T10:49:11.2153624Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2153939Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2153942Z 2025-12-04T10:49:11.2154035Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2154103Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2154180Z ================== 1 failed, 53 deselected, 2 rerun in 3.45s =================== 2025-12-04T10:49:11.2154222Z Got exit code 1 2025-12-04T10:49:11.2154266Z Retrying single test... 
2025-12-04T10:49:11.2154482Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cbc93a51d2064ca4.xml 2025-12-04T10:49:11.2154543Z ============================= test session starts ============================== 2025-12-04T10:49:11.2154681Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2154726Z cachedir: .pytest_cache 2025-12-04T10:49:11.2154913Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2154964Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2155009Z configfile: pytest.ini 2025-12-04T10:49:11.2155185Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2155267Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2155577Z stepcurrent: skipping 53 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2155627Z Running 1 items in this shard 2025-12-04T10:49:11.2155643Z 2025-12-04T10:49:11.2156039Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:21.107486553 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2156042Z 2025-12-04T10:49:11.2156209Z [W1204 10:45:28.456214333 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2156212Z 2025-12-04T10:49:11.2156377Z [W1204 10:45:28.456360831 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2156380Z 2025-12-04T10:49:11.2156545Z [W1204 10:45:28.459997564 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2156547Z 2025-12-04T10:49:11.2156711Z [W1204 10:45:28.460301260 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2156713Z 2025-12-04T10:49:11.2156877Z [W1204 10:45:28.460379249 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2156879Z 2025-12-04T10:49:11.2157040Z [W1204 10:45:28.462761377 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2157055Z 2025-12-04T10:49:11.2157219Z [W1204 10:45:28.463030954 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2157221Z 2025-12-04T10:49:11.2157380Z [W1204 10:45:28.463109233 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2157382Z 2025-12-04T10:49:11.2157438Z ('RERUN', {'yellow': True}) [9.9607s] [100%] 2025-12-04T10:49:11.2157829Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:30.695495360 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2157834Z 2025-12-04T10:49:11.2157996Z [W1204 10:45:30.695887014 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158000Z 2025-12-04T10:49:11.2158164Z [W1204 10:45:30.695977623 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158166Z 2025-12-04T10:49:11.2158327Z [W1204 10:45:30.697408315 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158329Z 2025-12-04T10:49:11.2158495Z [W1204 10:45:30.697747920 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158508Z 2025-12-04T10:49:11.2158669Z [W1204 10:45:30.697828829 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158686Z 2025-12-04T10:49:11.2158848Z [W1204 10:45:30.700169838 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2158850Z 2025-12-04T10:49:11.2159015Z [W1204 10:45:30.700441675 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2159017Z 2025-12-04T10:49:11.2159177Z [W1204 10:45:30.700518874 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2159179Z 2025-12-04T10:49:11.2159233Z ('RERUN', {'yellow': True}) [0.7114s] [100%] 2025-12-04T10:49:11.2159630Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:30.302403606 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2159635Z 2025-12-04T10:49:11.2159797Z [W1204 10:45:30.302807190 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2159799Z 2025-12-04T10:49:11.2159961Z [W1204 10:45:30.302895959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2159963Z 2025-12-04T10:49:11.2160123Z [W1204 10:45:30.304298511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2160125Z 2025-12-04T10:49:11.2160289Z [W1204 10:45:30.304633046 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2160292Z 2025-12-04T10:49:11.2160452Z [W1204 10:45:30.304712985 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2160453Z 2025-12-04T10:49:11.2160617Z [W1204 10:45:30.307021175 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2160619Z 2025-12-04T10:49:11.2160783Z [W1204 10:45:30.307285672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2160797Z 2025-12-04T10:49:11.2160957Z [W1204 10:45:30.307361151 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2160960Z 2025-12-04T10:49:11.2161003Z FAILED [0.6118s] [100%] 2025-12-04T10:49:11.2161005Z 2025-12-04T10:49:11.2161063Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2161227Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2161277Z Traceback (most recent call last): 2025-12-04T10:49:11.2161451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2161497Z method(*args, **kwargs) 2025-12-04T10:49:11.2161664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2161708Z method(*args, **kwargs) 2025-12-04T10:49:11.2161925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2161966Z with policy(): 2025-12-04T10:49:11.2162135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2162200Z raise RuntimeError(msg) 2025-12-04T10:49:11.2162645Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.2162648Z 2025-12-04T10:49:11.2162730Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2163047Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2163049Z 2025-12-04T10:49:11.2163147Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2163243Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2163306Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2163498Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2163580Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2163620Z graph_break [] 2025-12-04T10:49:11.2163700Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2164079Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2164126Z if out == self.unknown_value: 2025-12-04T10:49:11.2164293Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2164343Z Traceback (most recent call last): 2025-12-04T10:49:11.2164513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2164558Z method(*args, **kwargs) 2025-12-04T10:49:11.2164725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2164768Z method(*args, **kwargs) 2025-12-04T10:49:11.2164950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2164990Z with policy(): 2025-12-04T10:49:11.2165157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2165201Z raise RuntimeError(msg) 2025-12-04T10:49:11.2165638Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2165642Z 2025-12-04T10:49:11.2165721Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2166034Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2166037Z 2025-12-04T10:49:11.2166133Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2166210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2166273Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2166483Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2166565Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2166619Z graph_break [] 2025-12-04T10:49:11.2166698Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2167072Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2167123Z if out == self.unknown_value: 2025-12-04T10:49:11.2167201Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2167262Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2167351Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2167543Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2167583Z graph_break [] 2025-12-04T10:49:11.2167641Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2167804Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2167856Z Traceback (most recent call last): 2025-12-04T10:49:11.2168027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2168072Z method(*args, **kwargs) 2025-12-04T10:49:11.2168239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2168285Z method(*args, **kwargs) 2025-12-04T10:49:11.2168449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2168489Z with policy(): 2025-12-04T10:49:11.2168658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2168702Z raise RuntimeError(msg) 2025-12-04T10:49:11.2169140Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2169155Z 2025-12-04T10:49:11.2169234Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2169549Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2169553Z 2025-12-04T10:49:11.2169647Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2169726Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2169787Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2169978Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2170059Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2170099Z graph_break [] 2025-12-04T10:49:11.2170179Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2170554Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2170616Z if out == self.unknown_value: 2025-12-04T10:49:11.2170706Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2170767Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2170843Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2171036Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2171075Z graph_break [] 2025-12-04T10:49:11.2171155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2171213Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2171303Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2171491Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2171532Z graph_break [] 2025-12-04T10:49:11.2171798Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-cbc93a51d2064ca4.xml - 2025-12-04T10:49:11.2171898Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2172586Z FAILED [0.6118s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2172591Z 2025-12-04T10:49:11.2172669Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2172985Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2173008Z 2025-12-04T10:49:11.2173100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2173168Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2173241Z ================== 1 failed, 57 deselected, 2 rerun in 11.43s ================== 2025-12-04T10:49:11.2173281Z Got exit code 1 2025-12-04T10:49:11.2173325Z Retrying single test... 2025-12-04T10:49:11.2173543Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-57b35200295092c8.xml 2025-12-04T10:49:11.2173606Z ============================= test session starts ============================== 2025-12-04T10:49:11.2173729Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2173776Z cachedir: .pytest_cache 2025-12-04T10:49:11.2173949Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2174001Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2174045Z configfile: pytest.ini 2025-12-04T10:49:11.2174226Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2174307Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2174622Z stepcurrent: skipping 53 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2174683Z Running 1 items in this shard 2025-12-04T10:49:11.2174686Z 2025-12-04T10:49:11.2175093Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:39.456016234 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175096Z 2025-12-04T10:49:11.2175265Z [W1204 10:45:47.044072423 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175268Z 2025-12-04T10:49:11.2175431Z [W1204 10:45:47.044228411 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175449Z 2025-12-04T10:49:11.2175613Z [W1204 10:45:47.048239048 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175615Z 2025-12-04T10:49:11.2175777Z [W1204 10:45:47.048554354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175779Z 2025-12-04T10:49:11.2175941Z [W1204 10:45:47.048630813 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2175945Z 2025-12-04T10:49:11.2176105Z [W1204 10:45:47.051150769 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2176109Z 2025-12-04T10:49:11.2176269Z [W1204 10:45:47.051411736 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2176271Z 2025-12-04T10:49:11.2176435Z [W1204 10:45:47.051487655 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2176438Z 2025-12-04T10:49:11.2176493Z ('RERUN', {'yellow': True}) [10.0740s] [100%] 2025-12-04T10:49:11.2176882Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:48.019685456 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2176905Z 2025-12-04T10:49:11.2177068Z [W1204 10:45:48.020116740 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2177072Z 2025-12-04T10:49:11.2177232Z [W1204 10:45:48.020223899 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2177235Z 2025-12-04T10:49:11.2177398Z [W1204 10:45:48.021612151 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2177400Z 2025-12-04T10:49:11.2177565Z [W1204 10:45:48.021964036 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2177567Z 2025-12-04T10:49:11.2177730Z [W1204 10:45:48.022050125 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2177733Z 2025-12-04T10:49:11.2177894Z [W1204 10:45:48.024199246 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2177896Z 2025-12-04T10:49:11.2178059Z [W1204 10:45:48.024459943 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2178061Z 2025-12-04T10:49:11.2178235Z [W1204 10:45:48.024534432 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2178237Z 2025-12-04T10:49:11.2178291Z ('RERUN', {'yellow': True}) [0.4471s] [100%] 2025-12-04T10:49:11.2178694Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 [W1204 10:45:48.455006495 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2178698Z 2025-12-04T10:49:11.2178860Z [W1204 10:45:48.455401230 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2178862Z 2025-12-04T10:49:11.2179025Z [W1204 10:45:48.455493719 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2179029Z 2025-12-04T10:49:11.2179207Z [W1204 10:45:48.456880871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2179210Z 2025-12-04T10:49:11.2179372Z [W1204 10:45:48.457213476 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2179374Z 2025-12-04T10:49:11.2179536Z [W1204 10:45:48.457293425 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2179539Z 2025-12-04T10:49:11.2179700Z [W1204 10:45:48.459461806 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2179702Z 2025-12-04T10:49:11.2179863Z [W1204 10:45:48.459716973 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2179865Z 2025-12-04T10:49:11.2180026Z [W1204 10:45:48.459790502 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2180032Z 2025-12-04T10:49:11.2180073Z FAILED [0.4299s] [100%] 2025-12-04T10:49:11.2180075Z 2025-12-04T10:49:11.2180134Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2180299Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2180350Z Traceback (most recent call last): 2025-12-04T10:49:11.2180540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2180585Z method(*args, **kwargs) 2025-12-04T10:49:11.2180751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2180795Z method(*args, **kwargs) 2025-12-04T10:49:11.2180961Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2181005Z with policy(): 2025-12-04T10:49:11.2181173Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2181220Z raise RuntimeError(msg) 2025-12-04T10:49:11.2181650Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8192 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2181656Z 2025-12-04T10:49:11.2181734Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2182209Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2182235Z 2025-12-04T10:49:11.2182329Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2182424Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2182485Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2182681Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2182761Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2182802Z graph_break [] 2025-12-04T10:49:11.2182879Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2183272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2183321Z if out == self.unknown_value: 2025-12-04T10:49:11.2183487Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2183536Z Traceback (most recent call last): 2025-12-04T10:49:11.2183704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2183749Z method(*args, **kwargs) 2025-12-04T10:49:11.2183914Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2183959Z method(*args, **kwargs) 2025-12-04T10:49:11.2184122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2184163Z with policy(): 2025-12-04T10:49:11.2184332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2184377Z raise RuntimeError(msg) 2025-12-04T10:49:11.2184814Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 8192 and is now reported as 16384 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2184833Z 2025-12-04T10:49:11.2184915Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2185229Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2185231Z 2025-12-04T10:49:11.2185328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2185406Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2185467Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2185660Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2185738Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2185780Z graph_break [] 2025-12-04T10:49:11.2185857Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2186235Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2186295Z if out == self.unknown_value: 2025-12-04T10:49:11.2186373Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2186432Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2186522Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2186712Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2186755Z graph_break [] 2025-12-04T10:49:11.2186810Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2186975Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2187024Z Traceback (most recent call last): 2025-12-04T10:49:11.2187206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2187250Z method(*args, **kwargs) 2025-12-04T10:49:11.2187418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2187460Z method(*args, **kwargs) 2025-12-04T10:49:11.2187626Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2187667Z with policy(): 2025-12-04T10:49:11.2187835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2187880Z raise RuntimeError(msg) 2025-12-04T10:49:11.2188320Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2188324Z 2025-12-04T10:49:11.2188403Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2188716Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2188719Z 2025-12-04T10:49:11.2188814Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2188903Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2188964Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2189158Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2189238Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2189278Z graph_break [] 2025-12-04T10:49:11.2189357Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2189737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2189785Z if out == self.unknown_value: 2025-12-04T10:49:11.2189865Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2189925Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2190003Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2190193Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2190246Z graph_break [] 2025-12-04T10:49:11.2190322Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2190382Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2190469Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2190659Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2190700Z graph_break [] 2025-12-04T10:49:11.2190967Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-57b35200295092c8.xml - 2025-12-04T10:49:11.2191031Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2191738Z FAILED [0.4299s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 16384 and is now reported as 24576 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2191742Z 2025-12-04T10:49:11.2191821Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2192185Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2192187Z 2025-12-04T10:49:11.2192281Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2192347Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2192421Z ================== 1 failed, 57 deselected, 2 rerun in 11.09s ================== 2025-12-04T10:49:11.2192461Z Got exit code 1 2025-12-04T10:49:11.2192722Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2192861Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2193077Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8edfd89fae7d2d0f.xml 2025-12-04T10:49:11.2193163Z ============================= test session starts ============================== 2025-12-04T10:49:11.2193286Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2193334Z cachedir: .pytest_cache 2025-12-04T10:49:11.2193507Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2193559Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2193603Z configfile: pytest.ini 2025-12-04T10:49:11.2193783Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2193862Z collecting ... collected 58 items / 54 deselected / 4 selected 2025-12-04T10:49:11.2193921Z stepcurrent: skipping 54 already run items. 2025-12-04T10:49:11.2193971Z Running 4 items in this shard 2025-12-04T10:49:11.2193973Z 2025-12-04T10:49:11.2194246Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.8542s] [ 25%] 2025-12-04T10:49:11.2194514Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4390s] [ 25%] 2025-12-04T10:49:11.2194772Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 FAILED [0.4387s] [ 25%] 2025-12-04T10:49:11.2194789Z 2025-12-04T10:49:11.2194844Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2195007Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2195058Z Traceback (most recent call last): 2025-12-04T10:49:11.2195230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2195274Z method(*args, **kwargs) 2025-12-04T10:49:11.2195453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2195497Z method(*args, **kwargs) 2025-12-04T10:49:11.2195660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2195701Z with policy(): 2025-12-04T10:49:11.2195867Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2195913Z raise RuntimeError(msg) 2025-12-04T10:49:11.2196340Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2196344Z 2025-12-04T10:49:11.2196425Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2196743Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2197133Z 2025-12-04T10:49:11.2197228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2197447Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2197632Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2198051Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2198463Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2198622Z graph_break [] 2025-12-04T10:49:11.2198849Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2199107Z Traceback (most recent call last): 2025-12-04T10:49:11.2199360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.2199613Z method(*args, **kwargs) 2025-12-04T10:49:11.2199850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2200098Z method(*args, **kwargs) 2025-12-04T10:49:11.2200337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2200581Z with policy(): 2025-12-04T10:49:11.2200813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2201065Z raise RuntimeError(msg) 2025-12-04T10:49:11.2201576Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2202130Z 2025-12-04T10:49:11.2202213Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2202650Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2203006Z 2025-12-04T10:49:11.2203179Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2203446Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2203630Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2204059Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2204474Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2204633Z graph_break [] 2025-12-04T10:49:11.2204770Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2204950Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2205125Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2205534Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2205906Z graph_break [] 2025-12-04T10:49:11.2206025Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2206291Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.2206548Z Traceback (most recent call last): 2025-12-04T10:49:11.2206803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2207054Z method(*args, **kwargs) 2025-12-04T10:49:11.2207319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2207567Z method(*args, **kwargs) 2025-12-04T10:49:11.2207803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2208047Z with policy(): 2025-12-04T10:49:11.2208276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2208528Z raise RuntimeError(msg) 2025-12-04T10:49:11.2209043Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.2209521Z 2025-12-04T10:49:11.2209604Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2210038Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2210391Z 2025-12-04T10:49:11.2210487Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2210714Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2210895Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2211305Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2211716Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2211919Z graph_break [] 2025-12-04T10:49:11.2212054Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2212232Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2212406Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2212834Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2213206Z graph_break [] 2025-12-04T10:49:11.2213342Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2213520Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2213694Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2214106Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2214479Z graph_break [] 2025-12-04T10:49:11.2214808Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-8edfd89fae7d2d0f.xml - 2025-12-04T10:49:11.2215180Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2215978Z FAILED [0.4387s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2216728Z 2025-12-04T10:49:11.2216808Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2217241Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2217599Z 2025-12-04T10:49:11.2217693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2217895Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.2218081Z ================== 1 failed, 54 deselected, 2 rerun in 3.87s =================== 2025-12-04T10:49:11.2218238Z Got exit code 1 2025-12-04T10:49:11.2218344Z Retrying single test... 2025-12-04T10:49:11.2218634Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c0f6eb0e92ccf526.xml 2025-12-04T10:49:11.2218957Z ============================= test session starts ============================== 2025-12-04T10:49:11.2219184Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2219391Z cachedir: .pytest_cache 2025-12-04T10:49:11.2219637Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2219910Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2220039Z configfile: pytest.ini 2025-12-04T10:49:11.2220315Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2220614Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2221047Z stepcurrent: skipping 54 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2221442Z Running 1 items in this shard 2025-12-04T10:49:11.2221524Z 2025-12-04T10:49:11.2221989Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:08.764418165 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2222426Z 2025-12-04T10:49:11.2222598Z [W1204 10:46:15.527367598 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2222806Z 2025-12-04T10:49:11.2222974Z [W1204 10:46:15.527533916 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2223180Z 2025-12-04T10:49:11.2223346Z [W1204 10:46:15.531436264 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2223553Z 2025-12-04T10:49:11.2223720Z [W1204 10:46:15.531723830 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2223925Z 2025-12-04T10:49:11.2224090Z [W1204 10:46:15.531801099 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2224295Z 2025-12-04T10:49:11.2224459Z [W1204 10:46:15.534439873 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2224665Z 2025-12-04T10:49:11.2224831Z [W1204 10:46:15.534708270 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2225036Z 2025-12-04T10:49:11.2225199Z [W1204 10:46:15.534784869 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2225423Z 2025-12-04T10:49:11.2225478Z ('RERUN', {'yellow': True}) [9.7653s] [100%] 2025-12-04T10:49:11.2225960Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:15.333114118 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2226392Z 2025-12-04T10:49:11.2226557Z [W1204 10:46:15.333508693 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2226764Z 2025-12-04T10:49:11.2226931Z [W1204 10:46:15.333595652 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2227135Z 2025-12-04T10:49:11.2227301Z [W1204 10:46:15.334986603 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2227508Z 2025-12-04T10:49:11.2227677Z [W1204 10:46:15.335242970 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2227882Z 2025-12-04T10:49:11.2228048Z [W1204 10:46:15.335322399 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2228278Z 2025-12-04T10:49:11.2228444Z [W1204 10:46:15.337670077 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2228654Z 2025-12-04T10:49:11.2228832Z [W1204 10:46:15.337935764 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2229038Z 2025-12-04T10:49:11.2229202Z [W1204 10:46:15.338016073 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2229409Z 2025-12-04T10:49:11.2229464Z ('RERUN', {'yellow': True}) [0.6711s] [100%] 2025-12-04T10:49:11.2229961Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:16.051175711 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2230392Z 2025-12-04T10:49:11.2230556Z [W1204 10:46:16.051588426 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2230765Z 2025-12-04T10:49:11.2230928Z [W1204 10:46:16.051687265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2231133Z 2025-12-04T10:49:11.2231298Z [W1204 10:46:16.053167275 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2231504Z 2025-12-04T10:49:11.2231668Z [W1204 10:46:16.053458031 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2231934Z 2025-12-04T10:49:11.2232098Z [W1204 10:46:16.053537660 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2232307Z 2025-12-04T10:49:11.2232472Z [W1204 10:46:16.055914418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2232681Z 2025-12-04T10:49:11.2232846Z [W1204 10:46:16.056195024 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2233053Z 2025-12-04T10:49:11.2233217Z [W1204 10:46:16.056277563 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2233440Z 2025-12-04T10:49:11.2233484Z FAILED [0.7005s] [100%] 2025-12-04T10:49:11.2233554Z 2025-12-04T10:49:11.2233611Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2233879Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2234133Z Traceback (most recent call last): 2025-12-04T10:49:11.2234393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2234650Z method(*args, **kwargs) 2025-12-04T10:49:11.2234890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2235141Z method(*args, **kwargs) 2025-12-04T10:49:11.2235381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2235626Z with policy(): 2025-12-04T10:49:11.2235856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2236108Z raise RuntimeError(msg) 2025-12-04T10:49:11.2236616Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2237105Z 2025-12-04T10:49:11.2237186Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2237640Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2237999Z 2025-12-04T10:49:11.2238097Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2238315Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2238497Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2238919Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2239345Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2239505Z graph_break [] 2025-12-04T10:49:11.2239644Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2240145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2240610Z if out == self.unknown_value: 2025-12-04T10:49:11.2240858Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2241111Z Traceback (most recent call last): 2025-12-04T10:49:11.2241372Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2241627Z method(*args, **kwargs) 2025-12-04T10:49:11.2241916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2242168Z method(*args, **kwargs) 2025-12-04T10:49:11.2242404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2242648Z with policy(): 2025-12-04T10:49:11.2242895Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2243144Z raise RuntimeError(msg) 2025-12-04T10:49:11.2243657Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2244133Z 2025-12-04T10:49:11.2244214Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2244649Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2245004Z 2025-12-04T10:49:11.2245103Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2245313Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2245492Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2245893Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2249413Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2249589Z graph_break [] 2025-12-04T10:49:11.2249765Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2250268Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2250733Z if out == self.unknown_value: 2025-12-04T10:49:11.2250892Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2251073Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2251252Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2251695Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2252107Z graph_break [] 2025-12-04T10:49:11.2252225Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2252494Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2252747Z Traceback (most recent call last): 2025-12-04T10:49:11.2253004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2253255Z method(*args, **kwargs) 2025-12-04T10:49:11.2253494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2253740Z method(*args, **kwargs) 2025-12-04T10:49:11.2253981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2254225Z with policy(): 2025-12-04T10:49:11.2254456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2254706Z raise RuntimeError(msg) 2025-12-04T10:49:11.2255227Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2255735Z 2025-12-04T10:49:11.2255817Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2256255Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2256610Z 2025-12-04T10:49:11.2256706Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2256921Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2257102Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2257500Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2257914Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2258072Z graph_break [] 2025-12-04T10:49:11.2258210Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2258707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2259187Z if out == self.unknown_value: 2025-12-04T10:49:11.2259359Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2259540Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2259719Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2260134Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2260508Z graph_break [] 2025-12-04T10:49:11.2260644Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2260850Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2261026Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2261437Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2261808Z graph_break [] 2025-12-04T10:49:11.2262176Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c0f6eb0e92ccf526.xml - 2025-12-04T10:49:11.2262550Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2263360Z FAILED [0.7005s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2264083Z 2025-12-04T10:49:11.2264163Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2264598Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2264979Z 2025-12-04T10:49:11.2265075Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2265277Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2265461Z ================== 1 failed, 57 deselected, 2 rerun in 11.28s ================== 2025-12-04T10:49:11.2265614Z Got exit code 1 2025-12-04T10:49:11.2265719Z Retrying single test... 2025-12-04T10:49:11.2266008Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0660d43a00abe6a3.xml 2025-12-04T10:49:11.2266326Z ============================= test session starts ============================== 2025-12-04T10:49:11.2266556Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2266765Z cachedir: .pytest_cache 2025-12-04T10:49:11.2267008Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2267268Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2267395Z configfile: pytest.ini 2025-12-04T10:49:11.2267648Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2267969Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2268422Z stepcurrent: skipping 54 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2268813Z Running 1 items in this shard 2025-12-04T10:49:11.2268895Z 2025-12-04T10:49:11.2269293Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:26.042394516 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2269727Z 2025-12-04T10:49:11.2269896Z [W1204 10:46:34.588595022 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2270107Z 2025-12-04T10:49:11.2270299Z [W1204 10:46:34.588781119 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2270508Z 2025-12-04T10:49:11.2270673Z [W1204 10:46:34.592709056 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2270880Z 2025-12-04T10:49:11.2271047Z [W1204 10:46:34.592999252 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2271251Z 2025-12-04T10:49:11.2271416Z [W1204 10:46:34.593083511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2271619Z 2025-12-04T10:49:11.2271784Z [W1204 10:46:34.595667496 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2272040Z 2025-12-04T10:49:11.2272207Z [W1204 10:46:34.595931852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2272411Z 2025-12-04T10:49:11.2272576Z [W1204 10:46:34.596013511 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2272781Z 2025-12-04T10:49:11.2272840Z ('RERUN', {'yellow': True}) [10.5282s] [100%] 2025-12-04T10:49:11.2273321Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:34.436619471 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2273770Z 2025-12-04T10:49:11.2273934Z [W1204 10:46:34.437015405 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2274142Z 2025-12-04T10:49:11.2274308Z [W1204 10:46:34.437118264 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2274513Z 2025-12-04T10:49:11.2274678Z [W1204 10:46:34.438519805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2274884Z 2025-12-04T10:49:11.2275046Z [W1204 10:46:34.438790121 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2275253Z 2025-12-04T10:49:11.2275418Z [W1204 10:46:34.438867650 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2275622Z 2025-12-04T10:49:11.2275786Z [W1204 10:46:34.441156139 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2275991Z 2025-12-04T10:49:11.2276158Z [W1204 10:46:34.441415926 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2276386Z 2025-12-04T10:49:11.2276552Z [W1204 10:46:34.441492805 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2276771Z 2025-12-04T10:49:11.2276827Z ('RERUN', {'yellow': True}) [0.7035s] [100%] 2025-12-04T10:49:11.2277306Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 [W1204 10:46:35.115317818 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2277735Z 2025-12-04T10:49:11.2277900Z [W1204 10:46:35.115713592 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2278105Z 2025-12-04T10:49:11.2278286Z [W1204 10:46:35.115811351 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2278492Z 2025-12-04T10:49:11.2278655Z [W1204 10:46:35.117235012 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2278861Z 2025-12-04T10:49:11.2279023Z [W1204 10:46:35.117503408 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2279227Z 2025-12-04T10:49:11.2279395Z [W1204 10:46:35.117580667 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2279599Z 2025-12-04T10:49:11.2279761Z [W1204 10:46:35.119882526 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2279967Z 2025-12-04T10:49:11.2280132Z [W1204 10:46:35.120147552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2280337Z 2025-12-04T10:49:11.2280500Z [W1204 10:46:35.120231001 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2280704Z 2025-12-04T10:49:11.2280749Z FAILED [0.6793s] [100%] 2025-12-04T10:49:11.2280817Z 2025-12-04T10:49:11.2280879Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2281146Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2281417Z Traceback (most recent call last): 2025-12-04T10:49:11.2281678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2281980Z method(*args, **kwargs) 2025-12-04T10:49:11.2282224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2282476Z method(*args, **kwargs) 2025-12-04T10:49:11.2282713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2282959Z with policy(): 2025-12-04T10:49:11.2283187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.2283437Z raise RuntimeError(msg) 2025-12-04T10:49:11.2283949Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 8704 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2284419Z 2025-12-04T10:49:11.2284501Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2284956Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2285310Z 2025-12-04T10:49:11.2285419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2285633Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2285814Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2286213Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2286627Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2286786Z graph_break [] 2025-12-04T10:49:11.2286943Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2287441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2287899Z if out == self.unknown_value: 2025-12-04T10:49:11.2288143Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2288395Z Traceback (most recent call last): 2025-12-04T10:49:11.2288648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2288899Z method(*args, **kwargs) 2025-12-04T10:49:11.2289140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2289393Z method(*args, **kwargs) 2025-12-04T10:49:11.2289630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2289875Z with policy(): 2025-12-04T10:49:11.2290105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2290355Z raise RuntimeError(msg) 2025-12-04T10:49:11.2290866Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 8704 and is now reported as 17408 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2291363Z 2025-12-04T10:49:11.2291444Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2291948Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2292302Z 2025-12-04T10:49:11.2292398Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2292612Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2292791Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2293188Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2293603Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2293762Z graph_break [] 2025-12-04T10:49:11.2293900Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2294414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2294883Z if out == self.unknown_value: 2025-12-04T10:49:11.2295039Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2295218Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2295398Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2295809Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2296183Z graph_break [] 2025-12-04T10:49:11.2296316Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2296583Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2296832Z Traceback (most recent call last): 2025-12-04T10:49:11.2297085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2297338Z method(*args, **kwargs) 2025-12-04T10:49:11.2297576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2297828Z method(*args, **kwargs) 2025-12-04T10:49:11.2298065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2298309Z with policy(): 2025-12-04T10:49:11.2298542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2298794Z raise RuntimeError(msg) 2025-12-04T10:49:11.2299307Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2299782Z 2025-12-04T10:49:11.2299862Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2300313Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2300665Z 2025-12-04T10:49:11.2300760Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2300973Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2301155Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2301553Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2302020Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2302178Z graph_break [] 2025-12-04T10:49:11.2302316Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2302807Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2303265Z if out == self.unknown_value: 2025-12-04T10:49:11.2303438Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2303619Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2303800Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2304229Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2304604Z graph_break [] 2025-12-04T10:49:11.2304742Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2304925Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2305102Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2305532Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2305904Z graph_break [] 2025-12-04T10:49:11.2306231Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-0660d43a00abe6a3.xml - 2025-12-04T10:49:11.2306600Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2307398Z FAILED [0.6793s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 17408 and is now reported as 26112 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2308115Z 2025-12-04T10:49:11.2308193Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2308626Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2308982Z 2025-12-04T10:49:11.2309076Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2309293Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2309477Z ================== 1 failed, 57 deselected, 2 rerun in 12.05s ================== 2025-12-04T10:49:11.2309631Z Got exit code 1 2025-12-04T10:49:11.2309958Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2310399Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2310801Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dc49dcede20ac452.xml 2025-12-04T10:49:11.2311118Z ============================= test session starts ============================== 2025-12-04T10:49:11.2311347Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2311558Z cachedir: .pytest_cache 2025-12-04T10:49:11.2311805Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2312125Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.2312256Z configfile: pytest.ini 2025-12-04T10:49:11.2312509Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2312823Z collecting ... collected 58 items / 55 deselected / 3 selected 2025-12-04T10:49:11.2312996Z stepcurrent: skipping 55 already run items. 2025-12-04T10:49:11.2313139Z Running 3 items in this shard 2025-12-04T10:49:11.2313220Z 2025-12-04T10:49:11.2313515Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.6698s] [ 33%] 2025-12-04T10:49:11.2314104Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6223s] [ 33%] 2025-12-04T10:49:11.2314659Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 FAILED [0.6160s] [ 33%] 2025-12-04T10:49:11.2314944Z 2025-12-04T10:49:11.2315025Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2315294Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2315548Z Traceback (most recent call last): 2025-12-04T10:49:11.2315807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2316060Z method(*args, **kwargs) 2025-12-04T10:49:11.2316300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2316550Z method(*args, **kwargs) 2025-12-04T10:49:11.2316787Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2317031Z with policy(): 2025-12-04T10:49:11.2317262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2317513Z raise RuntimeError(msg) 2025-12-04T10:49:11.2318029Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2318510Z 2025-12-04T10:49:11.2318592Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2319050Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2319408Z 2025-12-04T10:49:11.2319503Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2319718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2319899Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2320195Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2320508Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2320665Z graph_break [] 2025-12-04T10:49:11.2320891Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2321144Z Traceback (most recent call last): 
2025-12-04T10:49:11.2321396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2321648Z method(*args, **kwargs) 2025-12-04T10:49:11.2321928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2322202Z method(*args, **kwargs) 2025-12-04T10:49:11.2322438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2322681Z with policy(): 2025-12-04T10:49:11.2322933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2323184Z raise RuntimeError(msg) 2025-12-04T10:49:11.2323709Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2324193Z 2025-12-04T10:49:11.2324273Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2324733Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2325088Z 2025-12-04T10:49:11.2325185Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2325399Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2325578Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2325873Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2326181Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2326337Z graph_break [] 2025-12-04T10:49:11.2326474Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2326652Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2326827Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2327134Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2327404Z graph_break [] 2025-12-04T10:49:11.2327518Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2327810Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2328061Z Traceback (most recent call last): 2025-12-04T10:49:11.2328313Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2328563Z method(*args, **kwargs) 2025-12-04T10:49:11.2328802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2329051Z method(*args, **kwargs) 2025-12-04T10:49:11.2329286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2329531Z with policy(): 2025-12-04T10:49:11.2329757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2330008Z raise RuntimeError(msg) 2025-12-04T10:49:11.2330535Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2331013Z 2025-12-04T10:49:11.2331094Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2331540Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2331944Z 2025-12-04T10:49:11.2332042Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2332255Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2332437Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2332728Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2333037Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2333193Z graph_break [] 2025-12-04T10:49:11.2333347Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2333527Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2333700Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2334009Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2334278Z graph_break [] 2025-12-04T10:49:11.2334412Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2334591Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2334766Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2335072Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), 
('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2335339Z graph_break [] 2025-12-04T10:49:11.2335672Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-dc49dcede20ac452.xml - 2025-12-04T10:49:11.2336043Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2336867Z FAILED [0.6160s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2337625Z 2025-12-04T10:49:11.2337704Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2338137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2338493Z 2025-12-04T10:49:11.2338587Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2338789Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2338970Z ================== 1 failed, 55 deselected, 2 rerun in 4.05s =================== 2025-12-04T10:49:11.2339124Z Got exit code 1 2025-12-04T10:49:11.2339227Z Retrying single test... 
2025-12-04T10:49:11.2339514Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c05e150630cfc0fc.xml 2025-12-04T10:49:11.2339831Z ============================= test session starts ============================== 2025-12-04T10:49:11.2340059Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2340280Z cachedir: .pytest_cache 2025-12-04T10:49:11.2340523Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2340792Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2340920Z configfile: pytest.ini 2025-12-04T10:49:11.2341168Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2341465Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2341961Z stepcurrent: skipping 55 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2342357Z Running 1 items in this shard 2025-12-04T10:49:11.2342435Z 2025-12-04T10:49:11.2342858Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:46:55.202735032 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2343293Z 2025-12-04T10:49:11.2343462Z [W1204 10:47:03.843726367 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2343671Z 2025-12-04T10:49:11.2343838Z [W1204 10:47:03.843869185 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2344048Z 2025-12-04T10:49:11.2344213Z [W1204 10:47:03.847486235 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2344418Z 2025-12-04T10:49:11.2344583Z [W1204 10:47:03.847780711 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2344789Z 2025-12-04T10:49:11.2344953Z [W1204 10:47:03.847856890 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2345158Z 2025-12-04T10:49:11.2345321Z [W1204 10:47:03.850248167 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2345526Z 2025-12-04T10:49:11.2345690Z [W1204 10:47:03.850514424 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2345914Z 2025-12-04T10:49:11.2346077Z [W1204 10:47:03.850590852 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2346283Z 2025-12-04T10:49:11.2346338Z ('RERUN', {'yellow': True}) [10.2865s] [100%] 2025-12-04T10:49:11.2346827Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:47:04.051340727 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2347261Z 2025-12-04T10:49:11.2347428Z [W1204 10:47:04.051748552 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2347633Z 2025-12-04T10:49:11.2347797Z [W1204 10:47:04.051845510 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2348010Z 2025-12-04T10:49:11.2348175Z [W1204 10:47:04.053229981 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2348379Z 2025-12-04T10:49:11.2348547Z [W1204 10:47:04.053560757 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2348772Z 2025-12-04T10:49:11.2348937Z [W1204 10:47:04.053639896 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2349141Z 2025-12-04T10:49:11.2349320Z [W1204 10:47:04.055989453 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2349527Z 2025-12-04T10:49:11.2349689Z [W1204 10:47:04.056251460 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2349897Z 2025-12-04T10:49:11.2350059Z [W1204 10:47:04.056328829 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2350265Z 2025-12-04T10:49:11.2350319Z ('RERUN', {'yellow': True}) [0.7077s] [100%] 2025-12-04T10:49:11.2350815Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:47:05.736343185 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2351248Z 2025-12-04T10:49:11.2351415Z [W1204 10:47:05.736742629 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2351618Z 2025-12-04T10:49:11.2351783Z [W1204 10:47:05.736838148 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2352034Z 2025-12-04T10:49:11.2352197Z [W1204 10:47:05.738300608 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2352400Z 2025-12-04T10:49:11.2352564Z [W1204 10:47:05.738651633 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2352769Z 2025-12-04T10:49:11.2352933Z [W1204 10:47:05.738733672 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2353136Z 2025-12-04T10:49:11.2353301Z [W1204 10:47:05.741076490 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2353506Z 2025-12-04T10:49:11.2353668Z [W1204 10:47:05.741347816 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2353903Z 2025-12-04T10:49:11.2354065Z [W1204 10:47:05.741426575 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2354271Z 2025-12-04T10:49:11.2354313Z FAILED [0.6734s] [100%] 2025-12-04T10:49:11.2354382Z 2025-12-04T10:49:11.2354441Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2354712Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2354968Z Traceback (most recent call last): 2025-12-04T10:49:11.2355229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2355482Z method(*args, **kwargs) 2025-12-04T10:49:11.2355725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2355977Z method(*args, **kwargs) 2025-12-04T10:49:11.2356216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2356462Z with policy(): 2025-12-04T10:49:11.2356690Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2356957Z raise RuntimeError(msg) 2025-12-04T10:49:11.2357492Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.2357967Z 2025-12-04T10:49:11.2358048Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2358488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2358844Z 2025-12-04T10:49:11.2358942Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2359172Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2359355Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2359648Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2359958Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2360117Z graph_break [] 2025-12-04T10:49:11.2360254Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2360756Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2361216Z if out == self.unknown_value: 2025-12-04T10:49:11.2361384Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2361437Z Traceback (most recent call last): 2025-12-04T10:49:11.2361605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2361649Z method(*args, **kwargs) 2025-12-04T10:49:11.2361817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2361903Z method(*args, **kwargs) 2025-12-04T10:49:11.2362069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2362132Z with policy(): 2025-12-04T10:49:11.2362298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2362344Z raise RuntimeError(msg) 2025-12-04T10:49:11.2362792Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2362797Z 2025-12-04T10:49:11.2362878Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2363199Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2363203Z 2025-12-04T10:49:11.2363297Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2363377Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2363439Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2363647Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2363727Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2363768Z graph_break [] 2025-12-04T10:49:11.2363863Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2364239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2364290Z if out == self.unknown_value: 2025-12-04T10:49:11.2364369Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2364430Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2364526Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2364720Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2364760Z graph_break [] 2025-12-04T10:49:11.2364820Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2364985Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2365036Z Traceback (most recent call last): 2025-12-04T10:49:11.2365205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2365249Z method(*args, **kwargs) 2025-12-04T10:49:11.2365416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2365460Z method(*args, **kwargs) 2025-12-04T10:49:11.2365629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2365671Z with policy(): 2025-12-04T10:49:11.2365839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2365884Z raise RuntimeError(msg) 2025-12-04T10:49:11.2366330Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2366346Z 2025-12-04T10:49:11.2366424Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2366744Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2366747Z 2025-12-04T10:49:11.2366843Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2366924Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2366984Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2367176Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2367256Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2367297Z graph_break [] 2025-12-04T10:49:11.2367374Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2367751Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2367816Z if out == self.unknown_value: 2025-12-04T10:49:11.2367907Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2367967Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2368046Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2368237Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2368280Z graph_break [] 2025-12-04T10:49:11.2368359Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2368417Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2368509Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2368699Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2368742Z graph_break [] 2025-12-04T10:49:11.2369009Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c05e150630cfc0fc.xml - 2025-12-04T10:49:11.2369076Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2369775Z FAILED [0.6734s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2369779Z 2025-12-04T10:49:11.2369859Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2370177Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2370180Z 2025-12-04T10:49:11.2370287Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2370357Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2370432Z ================== 1 failed, 57 deselected, 2 rerun in 11.82s ================== 2025-12-04T10:49:11.2370472Z Got exit code 1 2025-12-04T10:49:11.2370516Z Retrying single test... 2025-12-04T10:49:11.2370733Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-011e06cae8459d0d.xml 2025-12-04T10:49:11.2370796Z ============================= test session starts ============================== 2025-12-04T10:49:11.2370921Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2370966Z cachedir: .pytest_cache 2025-12-04T10:49:11.2371140Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2371191Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2371235Z configfile: pytest.ini 2025-12-04T10:49:11.2371414Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2371496Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2371815Z stepcurrent: skipping 55 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2371917Z Running 1 items in this shard 2025-12-04T10:49:11.2371919Z 2025-12-04T10:49:11.2372334Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:47:15.678967684 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2372338Z 2025-12-04T10:49:11.2372507Z [W1204 10:47:22.301248133 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2372509Z 2025-12-04T10:49:11.2372676Z [W1204 10:47:22.301405191 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2372679Z 2025-12-04T10:49:11.2372862Z [W1204 10:47:22.306295193 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2372864Z 2025-12-04T10:49:11.2373029Z [W1204 10:47:22.306621989 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2373031Z 2025-12-04T10:49:11.2373194Z [W1204 10:47:22.306704078 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2373197Z 2025-12-04T10:49:11.2373359Z [W1204 10:47:22.309180194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2373361Z 2025-12-04T10:49:11.2373524Z [W1204 10:47:22.309461850 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2373526Z 2025-12-04T10:49:11.2373691Z [W1204 10:47:22.309565798 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2373693Z 2025-12-04T10:49:11.2373749Z ('RERUN', {'yellow': True}) [10.2158s] [100%] 2025-12-04T10:49:11.2374148Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:47:23.296967448 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2374179Z 2025-12-04T10:49:11.2374343Z [W1204 10:47:23.297378472 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2374345Z 2025-12-04T10:49:11.2374508Z [W1204 10:47:23.297487821 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2374510Z 2025-12-04T10:49:11.2374675Z [W1204 10:47:23.298867961 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2374677Z 2025-12-04T10:49:11.2374841Z [W1204 10:47:23.299214527 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2374843Z 2025-12-04T10:49:11.2375005Z [W1204 10:47:23.299298436 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2375009Z 2025-12-04T10:49:11.2375170Z [W1204 10:47:23.301523325 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2375172Z 2025-12-04T10:49:11.2375334Z [W1204 10:47:23.301785911 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2375336Z 2025-12-04T10:49:11.2375499Z [W1204 10:47:23.301864670 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2375530Z 2025-12-04T10:49:11.2375585Z ('RERUN', {'yellow': True}) [0.4952s] [100%] 2025-12-04T10:49:11.2375990Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 [W1204 10:47:24.799737860 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2375995Z 2025-12-04T10:49:11.2376158Z [W1204 10:47:24.800119465 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2376160Z 2025-12-04T10:49:11.2376324Z [W1204 10:47:24.800202493 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2376326Z 2025-12-04T10:49:11.2376501Z [W1204 10:47:24.801561995 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2376504Z 2025-12-04T10:49:11.2376668Z [W1204 10:47:24.801883170 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2376670Z 2025-12-04T10:49:11.2376831Z [W1204 10:47:24.801963249 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2376834Z 2025-12-04T10:49:11.2376997Z [W1204 10:47:24.804157069 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2376999Z 2025-12-04T10:49:11.2377162Z [W1204 10:47:24.804418215 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2377164Z 2025-12-04T10:49:11.2377328Z [W1204 10:47:24.804495454 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2377331Z 2025-12-04T10:49:11.2377374Z FAILED [0.4788s] [100%] 2025-12-04T10:49:11.2377376Z 2025-12-04T10:49:11.2377432Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2377600Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2377650Z Traceback (most recent call last): 2025-12-04T10:49:11.2377837Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2377882Z method(*args, **kwargs) 2025-12-04T10:49:11.2378051Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2378094Z method(*args, **kwargs) 2025-12-04T10:49:11.2378262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2378303Z with policy(): 2025-12-04T10:49:11.2378471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2378516Z raise RuntimeError(msg) 2025-12-04T10:49:11.2378960Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 147456 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2378963Z 2025-12-04T10:49:11.2379046Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2379367Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2379380Z 2025-12-04T10:49:11.2379476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2379555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2379628Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2379822Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2379904Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2379943Z graph_break [] 2025-12-04T10:49:11.2380023Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2380416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2380465Z if out == self.unknown_value: 2025-12-04T10:49:11.2380635Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2380684Z Traceback (most recent call last): 2025-12-04T10:49:11.2380853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2380897Z method(*args, **kwargs) 2025-12-04T10:49:11.2381064Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2381107Z method(*args, **kwargs) 2025-12-04T10:49:11.2381274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2381314Z with policy(): 2025-12-04T10:49:11.2381485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2381529Z raise RuntimeError(msg) 2025-12-04T10:49:11.2382029Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 147456 and is now reported as 294912 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2382052Z 2025-12-04T10:49:11.2382132Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2382448Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2382451Z 2025-12-04T10:49:11.2382547Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2382626Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2382688Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2382881Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2382962Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2383005Z graph_break [] 2025-12-04T10:49:11.2383084Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2383461Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2383524Z if out == self.unknown_value: 2025-12-04T10:49:11.2383602Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2383663Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2383741Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2383950Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2383990Z graph_break [] 2025-12-04T10:49:11.2384049Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2384215Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 _ 2025-12-04T10:49:11.2384265Z Traceback (most recent call last): 2025-12-04T10:49:11.2384450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2384495Z method(*args, **kwargs) 2025-12-04T10:49:11.2384661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2384703Z method(*args, **kwargs) 2025-12-04T10:49:11.2384869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2384909Z with policy(): 2025-12-04T10:49:11.2385078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2385122Z raise RuntimeError(msg) 2025-12-04T10:49:11.2385572Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! 
Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2385576Z 2025-12-04T10:49:11.2385654Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2385974Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2385976Z 2025-12-04T10:49:11.2386071Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2386165Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2386226Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2386418Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2386497Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2386538Z graph_break [] 2025-12-04T10:49:11.2386617Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2386992Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2387040Z if out == self.unknown_value: 2025-12-04T10:49:11.2387118Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2387179Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2387256Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2387447Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2387499Z graph_break [] 2025-12-04T10:49:11.2387577Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2387635Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2387729Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2387920Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2387962Z graph_break [] 2025-12-04T10:49:11.2388227Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-011e06cae8459d0d.xml - 2025-12-04T10:49:11.2388293Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2389012Z FAILED [0.4788s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16! Caching allocator allocated memory was 294912 and is now reported as 442368 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2389016Z 2025-12-04T10:49:11.2389095Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2389415Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2389417Z 2025-12-04T10:49:11.2389511Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2389579Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2389655Z ================== 1 failed, 57 deselected, 2 rerun in 11.34s ================== 2025-12-04T10:49:11.2389697Z Got exit code 1 2025-12-04T10:49:11.2389961Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16 2025-12-04T10:49:11.2390102Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2390338Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fb2052acdf3cdf36.xml 2025-12-04T10:49:11.2390400Z ============================= test session starts ============================== 2025-12-04T10:49:11.2390523Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2390567Z cachedir: .pytest_cache 2025-12-04T10:49:11.2390745Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2390794Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2390838Z configfile: pytest.ini 2025-12-04T10:49:11.2391024Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, 
subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2391106Z collecting ... collected 58 items / 56 deselected / 2 selected 2025-12-04T10:49:11.2391163Z stepcurrent: skipping 56 already run items. 2025-12-04T10:49:11.2391214Z Running 2 items in this shard 2025-12-04T10:49:11.2391216Z 2025-12-04T10:49:11.2391488Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [2.7825s] [ 50%] 2025-12-04T10:49:11.2391756Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.6562s] [ 50%] 2025-12-04T10:49:11.2392084Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 FAILED [0.6574s] [ 50%] 2025-12-04T10:49:11.2392087Z 2025-12-04T10:49:11.2392143Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2392309Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2392360Z Traceback (most recent call last): 2025-12-04T10:49:11.2392533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2392576Z method(*args, **kwargs) 2025-12-04T10:49:11.2392768Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2392812Z method(*args, **kwargs) 2025-12-04T10:49:11.2392979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2393019Z with policy(): 2025-12-04T10:49:11.2393189Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2393234Z raise RuntimeError(msg) 2025-12-04T10:49:11.2393669Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2393673Z 2025-12-04T10:49:11.2393755Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2394073Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2394076Z 2025-12-04T10:49:11.2394173Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2394251Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2394312Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2394523Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2394603Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2394643Z graph_break [] 2025-12-04T10:49:11.2394808Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2394858Z Traceback (most recent call last): 2025-12-04T10:49:11.2395031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2395074Z method(*args, **kwargs) 
2025-12-04T10:49:11.2395241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2395284Z method(*args, **kwargs) 2025-12-04T10:49:11.2395451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2395493Z with policy(): 2025-12-04T10:49:11.2395662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2395707Z raise RuntimeError(msg) 2025-12-04T10:49:11.2396146Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 2025-12-04T10:49:11.2396162Z 2025-12-04T10:49:11.2396255Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2396570Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2396573Z 2025-12-04T10:49:11.2396668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2396747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2396808Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2397010Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2397091Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 
2025-12-04T10:49:11.2397130Z graph_break [] 2025-12-04T10:49:11.2397210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2397270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2397348Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2397541Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2397581Z graph_break [] 2025-12-04T10:49:11.2397638Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2397802Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2397853Z Traceback (most recent call last): 2025-12-04T10:49:11.2398022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2398067Z method(*args, **kwargs) 2025-12-04T10:49:11.2398232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2398278Z method(*args, **kwargs) 2025-12-04T10:49:11.2398465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2398506Z with policy(): 2025-12-04T10:49:11.2398673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2398718Z raise RuntimeError(msg) 2025-12-04T10:49:11.2399156Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2399160Z 2025-12-04T10:49:11.2399243Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2399558Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2399561Z 2025-12-04T10:49:11.2399655Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2399734Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2399793Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2399998Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2400076Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2400117Z graph_break [] 2025-12-04T10:49:11.2400210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2400270Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2400347Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2400540Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2400579Z graph_break [] 2025-12-04T10:49:11.2400658Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2400727Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2400807Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2400996Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2401037Z graph_break [] 2025-12-04T10:49:11.2401305Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fb2052acdf3cdf36.xml - 2025-12-04T10:49:11.2401373Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2402145Z FAILED [0.6574s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2402149Z 2025-12-04T10:49:11.2402229Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2402544Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2402567Z 2025-12-04T10:49:11.2402660Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2402730Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2402805Z ================== 1 failed, 56 deselected, 2 rerun in 4.26s =================== 2025-12-04T10:49:11.2402848Z Got exit code 1 2025-12-04T10:49:11.2402893Z Retrying single test... 
2025-12-04T10:49:11.2403114Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fe919aeb10c5c13b.xml 2025-12-04T10:49:11.2403179Z ============================= test session starts ============================== 2025-12-04T10:49:11.2403304Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2403350Z cachedir: .pytest_cache 2025-12-04T10:49:11.2403523Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2403574Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2403617Z configfile: pytest.ini 2025-12-04T10:49:11.2403795Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2403875Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2404203Z stepcurrent: skipping 56 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2404251Z Running 1 items in this shard 2025-12-04T10:49:11.2404270Z 2025-12-04T10:49:11.2404667Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:47:44.421585505 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2404671Z 2025-12-04T10:49:11.2404841Z [W1204 10:47:52.756578192 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2404843Z 2025-12-04T10:49:11.2405030Z [W1204 10:47:52.756736260 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405035Z 2025-12-04T10:49:11.2405199Z [W1204 10:47:52.759845266 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405201Z 2025-12-04T10:49:11.2405364Z [W1204 10:47:52.760152392 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405366Z 2025-12-04T10:49:11.2405532Z [W1204 10:47:52.760233361 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405535Z 2025-12-04T10:49:11.2405698Z [W1204 10:47:52.762645437 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405701Z 2025-12-04T10:49:11.2405864Z [W1204 10:47:52.762915793 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2405867Z 2025-12-04T10:49:11.2406029Z [W1204 10:47:52.762992822 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2406031Z 2025-12-04T10:49:11.2406085Z ('RERUN', {'yellow': True}) [9.9026s] [100%] 2025-12-04T10:49:11.2406478Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:47:53.725747864 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2406495Z 2025-12-04T10:49:11.2406658Z [W1204 10:47:53.726113289 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2406662Z 2025-12-04T10:49:11.2406824Z [W1204 10:47:53.726197538 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2406827Z 2025-12-04T10:49:11.2406989Z [W1204 10:47:53.727550389 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2406991Z 2025-12-04T10:49:11.2407153Z [W1204 10:47:53.727873194 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2407155Z 2025-12-04T10:49:11.2407318Z [W1204 10:47:53.727951273 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2407322Z 2025-12-04T10:49:11.2407483Z [W1204 10:47:53.730116473 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2407485Z 2025-12-04T10:49:11.2407646Z [W1204 10:47:53.730371019 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2407663Z 2025-12-04T10:49:11.2407827Z [W1204 10:47:53.730446478 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2407829Z 2025-12-04T10:49:11.2407883Z ('RERUN', {'yellow': True}) [0.4568s] [100%] 2025-12-04T10:49:11.2408285Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:47:53.178926797 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2408289Z 2025-12-04T10:49:11.2408452Z [W1204 10:47:53.179314952 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2408455Z 2025-12-04T10:49:11.2408617Z [W1204 10:47:53.179412481 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2408631Z 2025-12-04T10:49:11.2408794Z [W1204 10:47:53.180805321 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2408796Z 2025-12-04T10:49:11.2408958Z [W1204 10:47:53.181154026 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2408960Z 2025-12-04T10:49:11.2409122Z [W1204 10:47:53.181238115 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2409125Z 2025-12-04T10:49:11.2409285Z [W1204 10:47:53.183435974 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2409287Z 2025-12-04T10:49:11.2409448Z [W1204 10:47:53.183703910 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2409452Z 2025-12-04T10:49:11.2409614Z [W1204 10:47:53.183783459 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2409618Z 2025-12-04T10:49:11.2409660Z FAILED [0.4456s] [100%] 2025-12-04T10:49:11.2409662Z 2025-12-04T10:49:11.2409720Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2409884Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2409947Z Traceback (most recent call last): 2025-12-04T10:49:11.2410119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2410164Z method(*args, **kwargs) 2025-12-04T10:49:11.2410332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2410375Z method(*args, **kwargs) 2025-12-04T10:49:11.2410545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2410585Z with policy(): 2025-12-04T10:49:11.2410754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2410799Z raise RuntimeError(msg) 2025-12-04T10:49:11.2411230Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 
2025-12-04T10:49:11.2411235Z 2025-12-04T10:49:11.2411314Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2411629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2411644Z 2025-12-04T10:49:11.2411739Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2411831Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2411932Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2412125Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2412205Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2412247Z graph_break [] 2025-12-04T10:49:11.2412324Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2412724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2412773Z if out == self.unknown_value: 2025-12-04T10:49:11.2412937Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2412987Z Traceback (most recent call last): 2025-12-04T10:49:11.2413157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2413200Z method(*args, **kwargs) 2025-12-04T10:49:11.2413368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2413411Z method(*args, **kwargs) 2025-12-04T10:49:11.2413575Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2413617Z with policy(): 2025-12-04T10:49:11.2413784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2413830Z raise RuntimeError(msg) 2025-12-04T10:49:11.2414267Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2414287Z 2025-12-04T10:49:11.2414368Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2414678Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2414683Z 2025-12-04T10:49:11.2414778Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2414855Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2414916Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2415111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2415191Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2415233Z graph_break [] 2025-12-04T10:49:11.2415310Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2415688Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2415751Z if out == self.unknown_value: 2025-12-04T10:49:11.2415830Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2415889Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2415985Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2416175Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2416216Z graph_break [] 2025-12-04T10:49:11.2416273Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2416441Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2416492Z Traceback (most recent call last): 2025-12-04T10:49:11.2416673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2416718Z method(*args, **kwargs) 2025-12-04T10:49:11.2416884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2416928Z method(*args, **kwargs) 2025-12-04T10:49:11.2417093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2417132Z with policy(): 2025-12-04T10:49:11.2417302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2417346Z raise RuntimeError(msg) 2025-12-04T10:49:11.2417790Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2417793Z 2025-12-04T10:49:11.2417874Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2418189Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2418191Z 2025-12-04T10:49:11.2418286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2418375Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2418437Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2418627Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2418708Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2418749Z graph_break [] 2025-12-04T10:49:11.2418827Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2419203Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2419250Z if out == self.unknown_value: 2025-12-04T10:49:11.2419329Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2419388Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2419466Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2419656Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2419707Z graph_break [] 2025-12-04T10:49:11.2419784Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2419844Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2419937Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2420128Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2420169Z graph_break [] 2025-12-04T10:49:11.2420438Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-fe919aeb10c5c13b.xml - 2025-12-04T10:49:11.2420503Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2421203Z FAILED [0.4456s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2421206Z 2025-12-04T10:49:11.2421287Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2421600Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2421603Z 2025-12-04T10:49:11.2421696Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2421763Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2421837Z ================== 1 failed, 57 deselected, 2 rerun in 10.97s ================== 2025-12-04T10:49:11.2421925Z Got exit code 1 2025-12-04T10:49:11.2421970Z Retrying single test... 2025-12-04T10:49:11.2422187Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-51f3c272deefc6da.xml 2025-12-04T10:49:11.2422251Z ============================= test session starts ============================== 2025-12-04T10:49:11.2422391Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2422437Z cachedir: .pytest_cache 2025-12-04T10:49:11.2422611Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2422662Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2422706Z configfile: pytest.ini 2025-12-04T10:49:11.2422887Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2422968Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2423278Z stepcurrent: skipping 56 already run items. 
Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2423327Z Running 1 items in this shard 2025-12-04T10:49:11.2423331Z 2025-12-04T10:49:11.2423723Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:48:03.530710659 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2423726Z 2025-12-04T10:49:11.2423896Z [W1204 10:48:10.099072649 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2423912Z 2025-12-04T10:49:11.2424077Z [W1204 10:48:10.099246706 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2424082Z 2025-12-04T10:49:11.2424260Z [W1204 10:48:10.102883775 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2424263Z 2025-12-04T10:49:11.2424425Z [W1204 10:48:10.103277489 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2424429Z 2025-12-04T10:49:11.2424591Z [W1204 10:48:10.103367538 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2424593Z 2025-12-04T10:49:11.2424773Z [W1204 10:48:10.106046250 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2424776Z 2025-12-04T10:49:11.2424938Z [W1204 10:48:10.106360776 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2424940Z 2025-12-04T10:49:11.2425104Z [W1204 10:48:10.106439985 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2425106Z 2025-12-04T10:49:11.2425161Z ('RERUN', {'yellow': True}) [10.2446s] [100%] 2025-12-04T10:49:11.2425554Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:48:11.170639411 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2425556Z 2025-12-04T10:49:11.2425724Z [W1204 10:48:11.171053515 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2425727Z 2025-12-04T10:49:11.2425889Z [W1204 10:48:11.171148104 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2425891Z 2025-12-04T10:49:11.2426056Z [W1204 10:48:11.172573514 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2426059Z 2025-12-04T10:49:11.2426221Z [W1204 10:48:11.172908689 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2426238Z 2025-12-04T10:49:11.2426400Z [W1204 10:48:11.172988978 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2426402Z 2025-12-04T10:49:11.2426565Z [W1204 10:48:11.175325265 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2426568Z 2025-12-04T10:49:11.2426729Z [W1204 10:48:11.175600671 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2426731Z 2025-12-04T10:49:11.2426894Z [W1204 10:48:11.175676640 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2426896Z 2025-12-04T10:49:11.2426949Z ('RERUN', {'yellow': True}) [0.5479s] [100%] 2025-12-04T10:49:11.2427342Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 [W1204 10:48:12.679511775 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2427344Z 2025-12-04T10:49:11.2427511Z [W1204 10:48:12.679899010 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2427524Z 2025-12-04T10:49:11.2427686Z [W1204 10:48:12.679986418 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2427688Z 2025-12-04T10:49:11.2427863Z [W1204 10:48:12.681390319 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2427865Z 2025-12-04T10:49:11.2428027Z [W1204 10:48:12.681722184 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2428031Z 2025-12-04T10:49:11.2428192Z [W1204 10:48:12.681801303 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2428195Z 2025-12-04T10:49:11.2428366Z [W1204 10:48:12.684103340 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2428369Z 2025-12-04T10:49:11.2428530Z [W1204 10:48:12.684367616 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2428533Z 2025-12-04T10:49:11.2428696Z [W1204 10:48:12.684442915 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2428698Z 2025-12-04T10:49:11.2428740Z FAILED [0.5139s] [100%] 2025-12-04T10:49:11.2428742Z 2025-12-04T10:49:11.2428801Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2428965Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2429016Z Traceback (most recent call last): 2025-12-04T10:49:11.2429189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2429236Z method(*args, **kwargs) 2025-12-04T10:49:11.2429408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2429451Z method(*args, **kwargs) 2025-12-04T10:49:11.2429618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2429659Z with policy(): 2025-12-04T10:49:11.2429827Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2429886Z raise RuntimeError(msg) 2025-12-04T10:49:11.2430317Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9216 on device 0. CUDA driver allocated memory was 807403520 and is now 1207959552. 2025-12-04T10:49:11.2430322Z 2025-12-04T10:49:11.2430402Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2430721Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2430724Z 2025-12-04T10:49:11.2430818Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2430899Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2430961Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2431155Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2431236Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2431296Z graph_break [] 2025-12-04T10:49:11.2431375Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2431764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2431814Z if out == self.unknown_value: 2025-12-04T10:49:11.2432033Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2432084Z Traceback (most recent call last): 2025-12-04T10:49:11.2432252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2432296Z method(*args, **kwargs) 2025-12-04T10:49:11.2432476Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2432522Z method(*args, **kwargs) 2025-12-04T10:49:11.2432686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2432727Z with policy(): 2025-12-04T10:49:11.2432894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2432941Z raise RuntimeError(msg) 2025-12-04T10:49:11.2433377Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 9216 and is now reported as 18432 on device 0. CUDA driver allocated memory was 1207959552 and is now 1222639616. 
2025-12-04T10:49:11.2433382Z 2025-12-04T10:49:11.2433461Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2433779Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2433782Z 2025-12-04T10:49:11.2433877Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2433957Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2434017Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2434228Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2434307Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2434349Z graph_break [] 2025-12-04T10:49:11.2434426Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2434804Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2434852Z if out == self.unknown_value: 2025-12-04T10:49:11.2434930Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2434991Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2435071Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2435264Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2435303Z graph_break [] 2025-12-04T10:49:11.2435360Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2435524Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 _ 2025-12-04T10:49:11.2435596Z Traceback (most recent call last): 2025-12-04T10:49:11.2435778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2435822Z method(*args, **kwargs) 2025-12-04T10:49:11.2435987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2436032Z method(*args, **kwargs) 2025-12-04T10:49:11.2436196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2436236Z with policy(): 2025-12-04T10:49:11.2436402Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2436448Z raise RuntimeError(msg) 2025-12-04T10:49:11.2436904Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! 
Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 2025-12-04T10:49:11.2436907Z 2025-12-04T10:49:11.2436987Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2437301Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2437304Z 2025-12-04T10:49:11.2437398Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2437477Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2437538Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2437730Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2437809Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2437850Z graph_break [] 2025-12-04T10:49:11.2437927Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2438301Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2438363Z if out == self.unknown_value: 2025-12-04T10:49:11.2438442Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2438501Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2438581Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2438770Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2438812Z graph_break [] 2025-12-04T10:49:11.2438890Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2438950Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2439031Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2439219Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2439260Z graph_break [] 2025-12-04T10:49:11.2439525Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-51f3c272deefc6da.xml - 2025-12-04T10:49:11.2439603Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2440298Z FAILED [0.5139s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16! Caching allocator allocated memory was 18432 and is now reported as 27648 on device 0. CUDA driver allocated memory was 1222639616 and is now 1237319680. 
2025-12-04T10:49:11.2440302Z 2025-12-04T10:49:11.2440383Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2440710Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2440713Z 2025-12-04T10:49:11.2440806Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2440874Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2440948Z ================== 1 failed, 57 deselected, 2 rerun in 11.46s ================== 2025-12-04T10:49:11.2440989Z Got exit code 1 2025-12-04T10:49:11.2441248Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16 2025-12-04T10:49:11.2441390Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2441608Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ca81d5088b6bba0e.xml 2025-12-04T10:49:11.2441672Z ============================= test session starts ============================== 2025-12-04T10:49:11.2441795Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2441841Z cachedir: .pytest_cache 2025-12-04T10:49:11.2442063Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2442113Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2442156Z configfile: pytest.ini 2025-12-04T10:49:11.2442357Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, 
xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2442437Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2442496Z stepcurrent: skipping 57 already run items. 2025-12-04T10:49:11.2442543Z Running 1 items in this shard 2025-12-04T10:49:11.2442547Z 2025-12-04T10:49:11.2442818Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [3.0042s] [100%] 2025-12-04T10:49:11.2443088Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 ('RERUN', {'yellow': True}) [0.4895s] [100%] 2025-12-04T10:49:11.2443330Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 FAILED [0.4496s] [100%] 2025-12-04T10:49:11.2443333Z 2025-12-04T10:49:11.2443388Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2443551Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2443602Z Traceback (most recent call last): 2025-12-04T10:49:11.2443775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2443834Z method(*args, **kwargs) 2025-12-04T10:49:11.2444001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2444061Z method(*args, **kwargs) 2025-12-04T10:49:11.2444227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2444270Z with policy(): 2025-12-04T10:49:11.2444438Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2444483Z raise RuntimeError(msg) 2025-12-04T10:49:11.2444925Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2444929Z 2025-12-04T10:49:11.2445008Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2445325Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2445327Z 2025-12-04T10:49:11.2445421Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2445500Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2445561Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2445865Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2445945Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2445987Z graph_break [] 2025-12-04T10:49:11.2446151Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2446200Z Traceback (most recent call last): 2025-12-04T10:49:11.2446369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", 
line 3329, in wrapper 2025-12-04T10:49:11.2446425Z method(*args, **kwargs) 2025-12-04T10:49:11.2446590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2446632Z method(*args, **kwargs) 2025-12-04T10:49:11.2446796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2446838Z with policy(): 2025-12-04T10:49:11.2447005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2447048Z raise RuntimeError(msg) 2025-12-04T10:49:11.2447483Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2447487Z 2025-12-04T10:49:11.2447565Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2447881Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2447895Z 2025-12-04T10:49:11.2447989Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2448068Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2448130Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2448439Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2448520Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2448560Z graph_break [] 2025-12-04T10:49:11.2448639Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2448698Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2448774Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2449079Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2449120Z graph_break [] 2025-12-04T10:49:11.2449178Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2449344Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 
2025-12-04T10:49:11.2449457Z Traceback (most recent call last): 2025-12-04T10:49:11.2449655Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2449711Z method(*args, **kwargs) 2025-12-04T10:49:11.2452642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2452699Z method(*args, **kwargs) 2025-12-04T10:49:11.2452871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2452912Z with policy(): 2025-12-04T10:49:11.2453082Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2453126Z raise RuntimeError(msg) 2025-12-04T10:49:11.2453564Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 
2025-12-04T10:49:11.2453629Z 2025-12-04T10:49:11.2453710Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2454027Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2454031Z 2025-12-04T10:49:11.2454126Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2454206Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2454267Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2454564Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2454643Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2454683Z graph_break [] 2025-12-04T10:49:11.2454761Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2454836Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2454912Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2455218Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2455260Z graph_break [] 2025-12-04T10:49:11.2455337Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2455400Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2455476Z aot_autograd [('total', 1), 
('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2455784Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2455825Z graph_break [] 2025-12-04T10:49:11.2456093Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-ca81d5088b6bba0e.xml - 2025-12-04T10:49:11.2456161Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2456852Z FAILED [0.4496s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2456856Z 2025-12-04T10:49:11.2456937Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2457251Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2457255Z 2025-12-04T10:49:11.2457351Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2457418Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T10:49:11.2457505Z ================== 1 failed, 57 deselected, 2 rerun in 4.11s =================== 2025-12-04T10:49:11.2457548Z Got exit code 1 2025-12-04T10:49:11.2457591Z Retrying single test... 2025-12-04T10:49:11.2457810Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3f25067ed0dbd601.xml 2025-12-04T10:49:11.2457873Z ============================= test session starts ============================== 2025-12-04T10:49:11.2457999Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2458042Z cachedir: .pytest_cache 2025-12-04T10:49:11.2458219Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2458269Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2458314Z configfile: pytest.ini 2025-12-04T10:49:11.2458493Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2458575Z collecting ... collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2458887Z stepcurrent: skipping 57 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2458937Z Running 1 items in this shard 2025-12-04T10:49:11.2458954Z 2025-12-04T10:49:11.2459358Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:48:33.294179172 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2459363Z 2025-12-04T10:49:11.2459531Z [W1204 10:48:41.026117632 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2459534Z 2025-12-04T10:49:11.2459700Z [W1204 10:48:41.026306959 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2459702Z 2025-12-04T10:49:11.2459864Z [W1204 10:48:41.030200303 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2459866Z 2025-12-04T10:49:11.2460041Z [W1204 10:48:41.030566288 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2460043Z 2025-12-04T10:49:11.2460204Z [W1204 10:48:41.030649547 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2460206Z 2025-12-04T10:49:11.2460369Z [W1204 10:48:41.033349228 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2460372Z 2025-12-04T10:49:11.2460533Z [W1204 10:48:41.033650354 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2460535Z 2025-12-04T10:49:11.2460694Z [W1204 10:48:41.033730013 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2460696Z 2025-12-04T10:49:11.2460756Z ('RERUN', {'yellow': True}) [10.8802s] [100%] 2025-12-04T10:49:11.2461149Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:48:42.868170557 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2461151Z 2025-12-04T10:49:11.2461316Z [W1204 10:48:42.868651741 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2461331Z 2025-12-04T10:49:11.2461493Z [W1204 10:48:42.868753219 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2461495Z 2025-12-04T10:49:11.2461654Z [W1204 10:48:42.870438245 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2461658Z 2025-12-04T10:49:11.2461821Z [W1204 10:48:42.870740751 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2461823Z 2025-12-04T10:49:11.2462032Z [W1204 10:48:42.870818660 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2462034Z 2025-12-04T10:49:11.2462194Z [W1204 10:48:42.873325274 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2462197Z 2025-12-04T10:49:11.2462359Z [W1204 10:48:42.873600420 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2462361Z 2025-12-04T10:49:11.2462520Z [W1204 10:48:42.873675829 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2462522Z 2025-12-04T10:49:11.2462577Z ('RERUN', {'yellow': True}) [0.6348s] [100%] 2025-12-04T10:49:11.2462991Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:48:42.458070186 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2462994Z 2025-12-04T10:49:11.2463156Z [W1204 10:48:42.458508980 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2463160Z 2025-12-04T10:49:11.2463319Z [W1204 10:48:42.458622318 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2463323Z 2025-12-04T10:49:11.2463482Z [W1204 10:48:42.460276105 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2463484Z 2025-12-04T10:49:11.2463660Z [W1204 10:48:42.460605990 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2463663Z 2025-12-04T10:49:11.2463822Z [W1204 10:48:42.460695439 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2463826Z 2025-12-04T10:49:11.2463987Z [W1204 10:48:42.463228453 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2463989Z 2025-12-04T10:49:11.2464149Z [W1204 10:48:42.463539898 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2464151Z 2025-12-04T10:49:11.2464312Z [W1204 10:48:42.463622357 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2464314Z 2025-12-04T10:49:11.2464356Z FAILED [0.5495s] [100%] 2025-12-04T10:49:11.2464359Z 2025-12-04T10:49:11.2464417Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2464582Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2464631Z Traceback (most recent call last): 2025-12-04T10:49:11.2464804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2464848Z method(*args, **kwargs) 2025-12-04T10:49:11.2465033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2465076Z method(*args, **kwargs) 2025-12-04T10:49:11.2465241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2465281Z with policy(): 2025-12-04T10:49:11.2465449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2465493Z raise RuntimeError(msg) 2025-12-04T10:49:11.2465925Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. 
CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2465927Z 2025-12-04T10:49:11.2466007Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2466325Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2466327Z 2025-12-04T10:49:11.2466421Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2466515Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2466576Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2466891Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2466971Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2467012Z graph_break [] 2025-12-04T10:49:11.2467090Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2467465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2467527Z if out == self.unknown_value: 2025-12-04T10:49:11.2467690Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2467740Z Traceback (most recent call last): 2025-12-04T10:49:11.2467908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2467950Z method(*args, **kwargs) 2025-12-04T10:49:11.2468116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2468160Z method(*args, **kwargs) 2025-12-04T10:49:11.2468324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2468363Z with policy(): 2025-12-04T10:49:11.2468532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2468577Z raise RuntimeError(msg) 2025-12-04T10:49:11.2469017Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2469020Z 2025-12-04T10:49:11.2469099Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2469423Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2469426Z 2025-12-04T10:49:11.2469519Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2469600Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2469662Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2469959Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2470038Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2470078Z graph_break [] 2025-12-04T10:49:11.2470157Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2470527Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2470575Z if out == self.unknown_value: 2025-12-04T10:49:11.2470653Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2470725Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2470802Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2471111Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2471151Z graph_break [] 2025-12-04T10:49:11.2471209Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2471372Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2471423Z Traceback (most recent call last): 2025-12-04T10:49:11.2471602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2471647Z method(*args, **kwargs) 2025-12-04T10:49:11.2471810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2471927Z method(*args, **kwargs) 2025-12-04T10:49:11.2472093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2472133Z with policy(): 2025-12-04T10:49:11.2472299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2472344Z raise RuntimeError(msg) 2025-12-04T10:49:11.2472782Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2472785Z 2025-12-04T10:49:11.2472862Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2473173Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2473175Z 2025-12-04T10:49:11.2473267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2473363Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2473423Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2473717Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2473797Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2473836Z graph_break [] 2025-12-04T10:49:11.2473913Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2474286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2474334Z if out == self.unknown_value: 2025-12-04T10:49:11.2474411Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2474470Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2474546Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2474841Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2474895Z graph_break [] 2025-12-04T10:49:11.2474973Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2475046Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2475123Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2475414Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2475455Z graph_break [] 2025-12-04T10:49:11.2475725Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-3f25067ed0dbd601.xml - 2025-12-04T10:49:11.2475807Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2476492Z FAILED [0.5495s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2476496Z 2025-12-04T10:49:11.2476574Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2476886Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2476890Z 2025-12-04T10:49:11.2476983Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2477050Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2477125Z ================== 1 failed, 57 deselected, 2 rerun in 12.23s ================== 2025-12-04T10:49:11.2477165Z Got exit code 1 2025-12-04T10:49:11.2477209Z Retrying single test... 2025-12-04T10:49:11.2477422Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6f793b3d12672c6e.xml 2025-12-04T10:49:11.2477501Z ============================= test session starts ============================== 2025-12-04T10:49:11.2477624Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2477669Z cachedir: .pytest_cache 2025-12-04T10:49:11.2477842Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2477894Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T10:49:11.2477938Z configfile: pytest.ini 2025-12-04T10:49:11.2478117Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2478196Z collecting ... 
collected 58 items / 57 deselected / 1 selected 2025-12-04T10:49:11.2478505Z stepcurrent: skipping 57 already run items. Running only test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2478554Z Running 1 items in this shard 2025-12-04T10:49:11.2478556Z 2025-12-04T10:49:11.2478949Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:48:53.062541681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2478965Z 2025-12-04T10:49:11.2479132Z [W1204 10:49:00.469932419 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2479134Z 2025-12-04T10:49:11.2479307Z [W1204 10:49:00.470121247 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2479310Z 2025-12-04T10:49:11.2479471Z [W1204 10:49:00.473372120 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2479476Z 2025-12-04T10:49:11.2479636Z [W1204 10:49:00.473708845 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2479639Z 2025-12-04T10:49:11.2479812Z [W1204 10:49:00.473788084 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2479815Z 2025-12-04T10:49:11.2479977Z [W1204 10:49:00.476718742 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2479979Z 2025-12-04T10:49:11.2480139Z [W1204 10:49:00.477092646 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2480141Z 2025-12-04T10:49:11.2480303Z [W1204 10:49:00.477172325 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2480306Z 2025-12-04T10:49:11.2480361Z ('RERUN', {'yellow': True}) [10.4766s] [100%] 2025-12-04T10:49:11.2480755Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:49:01.203373184 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2480759Z 2025-12-04T10:49:11.2480920Z [W1204 10:49:01.204040944 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2480923Z 2025-12-04T10:49:11.2481087Z [W1204 10:49:01.204215512 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2481088Z 2025-12-04T10:49:11.2481249Z [W1204 10:49:01.205956017 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2481264Z 2025-12-04T10:49:11.2481424Z [W1204 10:49:01.206278782 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2481426Z 2025-12-04T10:49:11.2481588Z [W1204 10:49:01.206361871 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2481591Z 2025-12-04T10:49:11.2481751Z [W1204 10:49:01.208630978 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2481754Z 2025-12-04T10:49:11.2481946Z [W1204 10:49:01.208890615 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2481948Z 2025-12-04T10:49:11.2482110Z [W1204 10:49:01.208966114 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2482113Z 2025-12-04T10:49:11.2482166Z ('RERUN', {'yellow': True}) [0.5882s] [100%] 2025-12-04T10:49:11.2482555Z inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 [W1204 10:49:02.804985469 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2482582Z 2025-12-04T10:49:11.2482743Z [W1204 10:49:02.805443942 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2482745Z 2025-12-04T10:49:11.2482921Z [W1204 10:49:02.805553681 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2482923Z 2025-12-04T10:49:11.2483083Z [W1204 10:49:02.807087939 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2483086Z 2025-12-04T10:49:11.2483249Z [W1204 10:49:02.807409994 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 
2025-12-04T10:49:11.2483251Z 2025-12-04T10:49:11.2483413Z [W1204 10:49:02.807497043 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2483430Z 2025-12-04T10:49:11.2483590Z [W1204 10:49:02.809785110 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2483592Z 2025-12-04T10:49:11.2483755Z [W1204 10:49:02.810118705 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2483757Z 2025-12-04T10:49:11.2483918Z [W1204 10:49:02.810210044 Module.cpp:201] symbolizing C++ stack trace for exception; if this hangs, rerun with TORCH_DISABLE_ADDR2LINE=1... 2025-12-04T10:49:11.2483923Z 2025-12-04T10:49:11.2483965Z FAILED [0.5806s] [100%] 2025-12-04T10:49:11.2483967Z 2025-12-04T10:49:11.2484024Z ==================================== RERUNS ==================================== 2025-12-04T10:49:11.2484188Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2484240Z Traceback (most recent call last): 2025-12-04T10:49:11.2484413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2484458Z method(*args, **kwargs) 2025-12-04T10:49:11.2484627Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2484671Z method(*args, **kwargs) 2025-12-04T10:49:11.2484835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2484890Z with policy(): 2025-12-04T10:49:11.2485056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T10:49:11.2485101Z raise RuntimeError(msg) 2025-12-04T10:49:11.2485536Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 0 and is now reported as 9728 on device 0. CUDA driver allocated memory was 807403520 and is now 1298137088. 2025-12-04T10:49:11.2485541Z 2025-12-04T10:49:11.2485621Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2485938Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2485942Z 2025-12-04T10:49:11.2486034Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2486113Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2486172Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2486470Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2486560Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2486601Z graph_break [] 2025-12-04T10:49:11.2486687Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2487063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2487112Z if out == self.unknown_value: 2025-12-04T10:49:11.2487274Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2487323Z Traceback (most recent call last): 2025-12-04T10:49:11.2487503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2487549Z method(*args, **kwargs) 2025-12-04T10:49:11.2487716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2487760Z method(*args, **kwargs) 2025-12-04T10:49:11.2487924Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2487965Z with policy(): 2025-12-04T10:49:11.2488132Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2488176Z raise RuntimeError(msg) 2025-12-04T10:49:11.2488615Z RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 9728 and is now reported as 19456 on device 0. CUDA driver allocated memory was 1298137088 and is now 1312817152. 
2025-12-04T10:49:11.2488618Z 2025-12-04T10:49:11.2488699Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2489015Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2489019Z 2025-12-04T10:49:11.2489112Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2489203Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2489263Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2489560Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2489639Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2489681Z graph_break [] 2025-12-04T10:49:11.2489757Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2490133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2490181Z if out == self.unknown_value: 2025-12-04T10:49:11.2490259Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2490317Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2490395Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2490690Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2490741Z graph_break [] 2025-12-04T10:49:11.2490807Z =================================== FAILURES =================================== 2025-12-04T10:49:11.2490973Z _ TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 _ 2025-12-04T10:49:11.2491022Z Traceback (most recent call last): 2025-12-04T10:49:11.2491191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2491235Z method(*args, **kwargs) 2025-12-04T10:49:11.2491398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T10:49:11.2491442Z method(*args, **kwargs) 2025-12-04T10:49:11.2491618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T10:49:11.2491659Z with policy(): 2025-12-04T10:49:11.2491825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T10:49:11.2491944Z raise RuntimeError(msg) 2025-12-04T10:49:11.2492381Z RuntimeError: CUDA driver API confirmed a leak in 
__main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2492385Z 2025-12-04T10:49:11.2492464Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2492779Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2492782Z 2025-12-04T10:49:11.2492876Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2492955Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2493014Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2493309Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2493405Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2493445Z graph_break [] 2025-12-04T10:49:11.2493522Z ----------------------------- Captured stderr call ----------------------------- 2025-12-04T10:49:11.2493897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/constant_folding.py:256: UserWarning: Unsupported unwinding pattern: Address not in range (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/profiler/unwind/unwind.cpp:219.) 
2025-12-04T10:49:11.2493945Z if out == self.unknown_value: 2025-12-04T10:49:11.2494023Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2494082Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2494159Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2494453Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2494494Z graph_break [] 2025-12-04T10:49:11.2494570Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T10:49:11.2494645Z stats [('calls_captured', 3), ('unique_graphs', 1)] 2025-12-04T10:49:11.2494721Z aot_autograd [('total', 1), ('autograd_cache_bypass', 1), ('ok', 1)] 2025-12-04T10:49:11.2495029Z inductor [('pattern_matcher_nodes', 8), ('woq_matcher_nodes', 6), ('pattern_matcher_count', 3), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pad_mm_bench', 1), ('fxgraph_cache_miss', 1), ('woq_matcher_count', 1), ('extern_calls', 1)] 2025-12-04T10:49:11.2495071Z graph_break [] 2025-12-04T10:49:11.2495334Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-6f793b3d12672c6e.xml - 2025-12-04T10:49:11.2495400Z =========================== short test summary info ============================ 2025-12-04T10:49:11.2496109Z FAILED [0.5806s] inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 - RuntimeError: CUDA driver API confirmed a leak in __main__.TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16! 
Caching allocator allocated memory was 19456 and is now reported as 29184 on device 0. CUDA driver allocated memory was 1312817152 and is now 1327497216. 2025-12-04T10:49:11.2496114Z 2025-12-04T10:49:11.2496194Z To execute this test, run the following from the base repo dir: 2025-12-04T10:49:11.2496504Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/inductor/test_cuda_select_algorithm.py TestSelectAlgorithmCudaCUDA.test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2496508Z 2025-12-04T10:49:11.2496599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T10:49:11.2496666Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T10:49:11.2496739Z ================== 1 failed, 57 deselected, 2 rerun in 11.82s ================== 2025-12-04T10:49:11.2496780Z Got exit code 1 2025-12-04T10:49:11.2497037Z FAILED CONSISTENTLY: test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16 2025-12-04T10:49:11.2497177Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T10:49:11.2497392Z Test results will be stored in test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c473632d229e6f83.xml 2025-12-04T10:49:11.2497476Z ============================= test session starts ============================== 2025-12-04T10:49:11.2497597Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T10:49:11.2497643Z cachedir: .pytest_cache 2025-12-04T10:49:11.2497815Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T10:49:11.2497867Z rootdir: /var/lib/jenkins/pytorch 
2025-12-04T10:49:11.2497910Z configfile: pytest.ini 2025-12-04T10:49:11.2498088Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T10:49:11.2498168Z collecting ... collected 58 items / 58 deselected / 0 selected 2025-12-04T10:49:11.2498225Z stepcurrent: skipping 58 already run items. 2025-12-04T10:49:11.2498276Z Running 0 items in this shard 2025-12-04T10:49:11.2498279Z 2025-12-04T10:49:11.2498542Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_cuda_select_algorithm/inductor.test_cuda_select_algorithm-c473632d229e6f83.xml - 2025-12-04T10:49:11.2498606Z ============================ 58 deselected in 0.01s ============================ 2025-12-04T10:49:11.2511796Z The following tests failed consistently: ['test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_concat_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16', 
'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16', 
'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_17_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16', 
'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_1_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16', 
'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_1024_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_1_in_features_144_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_64_cuda_bfloat16', 
'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_1024_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_128_out_features_65_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_1024_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_64_cuda_bfloat16', 'test/inductor/test_cuda_select_algorithm.py::TestSelectAlgorithmCudaCUDA::test_int8_woq_mm_cuda_batch_size_32_mid_dim_8_in_features_144_out_features_65_cuda_bfloat16'] 2025-12-04T10:49:11.2511933Z 2025-12-04T10:49:11.2512142Z FINISHED PRINTING LOG FILE of inductor/test_cuda_select_algorithm 1/1 (test/test-reports/inductor.test_cuda_select_algorithm_1.1_a2cc8512cf78dd46_.log) 2025-12-04T10:49:11.2512147Z 2025-12-04T10:49:11.2512286Z Finished inductor/test_cuda_select_algorithm 1/1 ... 
[2025-12-04 10:49:10.926179][3571859.450989461], took 45.65min 2025-12-04T10:49:11.2512540Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T10:49:11.2512638Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:11.2512741Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T10:49:11.2512794Z Uploading artifacts took 0.00 seconds 2025-12-04T10:49:11.2512857Z inductor/test_cuda_select_algorithm 1/1 failed! 2025-12-04T10:49:11.2512981Z Running inductor/test_aot_inductor_arrayref 1/1 ... [2025-12-04 10:49:10.932362][3571859.457175602] 2025-12-04T10:49:11.2513034Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:11.2513383Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_arrayref.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:49:10.932551] 2025-12-04T10:57:33.4334544Z 2025-12-04T10:57:33.4335257Z inductor/test_aot_inductor_arrayref 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_arrayref_1.1_771813d5c1070670_.log 2025-12-04T10:57:33.4405943Z Running 309 items in this shard: test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__int_mm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_32_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_m_32_n_64_q_group_64_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_32_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_1_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test__weight_int4pack_mm_with_scales_and_zeros_m_32_n_64_q_group_64_num_groups_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_add_complex_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_addmm_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aliased_buffer_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_amp_fallback_random_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aot_inductor_consts_cpp_build_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_constant_tensor_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_constant_tensor_name_collision_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_cpp_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_fp8_dtype_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_sym_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printer_user_defined_triton_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_debug_printing_model_inputs_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_profiler_enable_kernel_profile_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_profiler_enable_kernel_profile_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_runtime_asserts_backed_symint_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_runtime_asserts_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_aoti_user_defined_triton_kernel_profiling_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_assert_async_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_assert_tensor_meta_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotune_int64_user_defined_triton_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotune_with_constant_folding_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_autotuning_args_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_backward_no_op_logging_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_bmm_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_bool_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_boolean_indexing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_3_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_4_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_mutation_and_force_mmap_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_buffer_reuse_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_clamp_decomposition_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_codegen_int_array_var_fix_memory_leak_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_composed_dynamic_size_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_cpu_predicate_cuda_operands_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_cpu_predicate_cuda_operands_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_mismatched_branch_output_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_mismatched_branch_output_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_nested_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_non_tensor_predicates_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_predicate_on_cpu_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_share_predicate_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_simple_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_symint_input_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_symint_input_disable_one_pass_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_unbacked_symint_closure_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_unbacked_symint_closure_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_use_buffers_from_outer_scope_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_multiple_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_outer_code_before_after_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_parameters_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_reinterpret_view_inputs_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_cond_with_replace_view_ops_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_consecutive_compiles_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_folding_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_folding_with_update_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_original_fqn_and_dtype_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_constant_type_propagation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_conv3d_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_conv_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_convolution_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_copy_non_blocking_is_pinned_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_custom_op_in_subgraph_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_d2h_copy_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_deconv_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_device_moved_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dup_unbacked_sym_decl_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dup_unbacked_sym_decl_with_refinement_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_duplicate_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_duplicated_params_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_cat_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_scalar_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_dynamic_smem_above_default_limit_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_embedding_bag_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_cat_dtype_promotion_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_constant_folding_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_empty_graph_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_extract_constants_map_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fake_tensor_device_validation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fallback_kernel_with_symexpr_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fallback_mem_leak_fix_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fft_c2c_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fill__fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_foreach_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fp8_view_of_param_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fqn_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_free_inactive_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_fx_gm_return_tuple_validation_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_index_put_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_index_put_with_none_index_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_inf_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_input_codegen_with_sympy_expr_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_int_list_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_issue_140766_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_dynamic_dim_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_mmaped_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_mmaped_weights_on_disk_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_large_weight_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_libtorch_free_so_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_dynamic_maxautotune_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_linear_freezing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_load_package_multiple_gpus_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_masked_select_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misaligned_input_1_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misaligned_input_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misc_1_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_misc_1_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_cubin_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_missing_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_mixed_device_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_model_modified_weights_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_multi_device_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_multiple_output_alias_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_nan_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_narrow_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_nested_tensor_from_jagged_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_no_args_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_contiguous_output_alias_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_default_gpu_device_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_non_tensor_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_none_args_aot_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_normal_functional_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_on_gpu_device1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_misaligned_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_path_1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_output_path_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pad_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pad_non_zero_memory_leak_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_poi_multiple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_profile_benchmark_harness_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_abs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_hann_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_permute_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_proxy_executor_squeeze_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_pytree_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quanatized_int8_linear_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quantized_linear_bias_none_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_quantized_linear_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeat_interleave_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeat_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_calling_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_user_defined_triton_kernel_embed_kernel_binary_False_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_repeated_user_defined_triton_kernel_embed_kernel_binary_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_replace_unbacked_symbol_with_backed_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_replicate_on_devices_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_return_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_return_view_constant_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_reuse_kernel_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_reuse_kernel_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_rocm_triton_autotuning_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_run_with_grad_enabled_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_complex_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_device_type_failed_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_dtype_failed_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_fp8_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_large_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_runtime_checks_shape_failed_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_same_backing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scaled_dot_product_efficient_attention_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scaled_grouped_mm_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_scatter_reduce_fallback_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sdpa_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sdpa_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_seq_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_shifted_constraint_ranges_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_dynamic_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_False_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_False_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_True_max_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_embed_kernel_binary_True_max_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_multi_arch_embed_kernel_binary_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_simple_split_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_from_multi_output_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_and_mul_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_size_with_unbacked_add_expr_transitive_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_small_constant_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_so_without_weight_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_stft_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_stride_with_unbacked_expr_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_subclasses_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sym_expr_indexing_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sym_i64_input_codegen_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symbool_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symfloat_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_symint_item_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sympy_cpp_printer_min_max_minmax0_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_sympy_cpp_printer_min_max_minmax1_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_torchvision_transforms_functional_tensor_resize_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_autotuning_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_dynamic_launcher_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_dynamic_launcher_grid_infer_from_tensor_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_bool_param_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_dynamic_grid_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_dynamic_shape_with_div_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_equal_to_1_float_arg_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_extern_kernel_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_1_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_2_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_False_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_1_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_False_autotune_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_grid_type_3_num_dims_2_dynamic_True_autotune_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_multi_output_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_True_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_on_device_tma_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_reinterpret_view_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_reinterpret_view_mem_leak_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_expr_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_sympy_fn_like_arg_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_1d_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_False_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_new_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_tma_descriptor_2d_dynamic_True_tma_version_old_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_False_autotuning_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_unbacked_symint_in_grid_dynamic_True_autotuning_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_weird_param_order_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_with_none_input_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_kernel_with_none_inputs_and_equal_to_1_arg_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_mutated_autotuning_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_triton_next_power_of_2_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_equals_input_size_runtime_assertion_mark_unbacked_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_0_use_static_size_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_0_use_static_size_True_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_1_use_static_size_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_1_use_static_size_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_2_use_static_size_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_2_use_static_size_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_3_use_static_size_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbacked_expr_replacements_shift_k_3_use_static_size_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_unbounded_expr_substitutions_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_constant_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_constant_buffer_simple_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_inactive_constant_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_update_user_managed_buffer_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_upper_bound_i64_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_using_model_name_for_files_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_view_outputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_weight_on_disk_legacy_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_nested_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_simple_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_conv_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_conv_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_mixed_device_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_mixed_device_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_outer_buffers_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_outer_code_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_parameters_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_pytree_inputs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_sym_expr_cond_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_sym_expr_cond_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_unbacked_symint_closure_dynamic_False_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_while_loop_with_unbacked_symint_closure_dynamic_True_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_cudagraphs_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_no_triton_profiler_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_offset_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_with_profiler_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_grid_with_backed_symbols_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_grid_with_unbacked_symbols_cpu_with_stack_allocation, 
test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_size_buffer_cpu_with_stack_allocation, test/inductor/test_aot_inductor_arrayref.py::AOTInductorTestABICompatibleCpuWithStackAllocation::test_zero_size_weight_cpu_with_stack_allocation
2025-12-04T10:57:33.4474780Z
2025-12-04T10:57:33.4474918Z Finished inductor/test_aot_inductor_arrayref 1/1 ... [2025-12-04 10:57:33.433370][3572361.958181231], took 8.38min
2025-12-04T10:57:33.4475326Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T10:57:33.4475710Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T10:57:33.4475942Z Running inductor/test_deterministic 1/4 ... [2025-12-04 10:57:33.439500][3572361.964313374]
2025-12-04T10:57:33.4476159Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T10:57:33.4476557Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_deterministic.py', '--shard-id=1', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:57:33.439700]
2025-12-04T10:59:13.2056398Z
2025-12-04T10:59:13.2057198Z inductor/test_deterministic 1/4 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_deterministic_1.4_78df5470457b5502_.log
2025-12-04T10:59:13.2059424Z Running 6 items in this shard: test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_BertForMaskedLM_training_or_inference_inference_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_DistillGPT2_training_or_inference_inference_precision_bfloat16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_inference_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_inference_precision_float16, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_training_precision_amp, test/inductor/test_deterministic.py::DeterministicTest::test_run2run_determinism_model_name_GoogleFnet_training_or_inference_training_precision_float16
2025-12-04T10:59:13.2060759Z
2025-12-04T10:59:13.2060893Z Finished inductor/test_deterministic 1/4 ... [2025-12-04 10:59:13.205272][3572461.730080746], took 1.66min
2025-12-04T10:59:13.2066341Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T10:59:13.2122552Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T10:59:13.2126307Z Running inductor/test_inductor_utils 1/1 ... [2025-12-04 10:59:13.212332][3572461.737145893]
2025-12-04T10:59:13.2126506Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T10:59:13.2128142Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_inductor_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:59:13.212520]
2025-12-04T10:59:16.6828131Z
2025-12-04T10:59:16.6829232Z inductor/test_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_inductor_utils_1.1_68ed929b705644e8_.log
2025-12-04T10:59:16.6830535Z Running 2 items in this shard: test/inductor/test_inductor_utils.py::TestBench::test_benchmarker, test/inductor/test_inductor_utils.py::TestBench::test_do_bench_using_profiling
2025-12-04T10:59:16.6831150Z
2025-12-04T10:59:16.6831532Z Finished inductor/test_inductor_utils 1/1 ... [2025-12-04 10:59:16.682417][3572465.207225382], took 0.06min
2025-12-04T10:59:16.6838435Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T10:59:16.6894299Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T10:59:16.6896341Z Running inductor/test_template_heuristics_registry 1/1 ... [2025-12-04 10:59:16.689466][3572465.214279898]
2025-12-04T10:59:16.6896755Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T10:59:16.6897863Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_template_heuristics_registry.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:59:16.689655]
2025-12-04T10:59:19.3583470Z
2025-12-04T10:59:19.3585281Z inductor/test_template_heuristics_registry 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_template_heuristics_registry_1.1_8e5468da7acf6eab_.log
2025-12-04T10:59:19.3587694Z Running 5 items in this shard: test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_assertion_existing_class, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_fallback_behavior, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_hierarchy_lookup, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_partial_hierarchy_scenarios, test/inductor/test_template_heuristics_registry.py::TestTemplateHeuristicsRegistry::test_register_class
2025-12-04T10:59:19.3589424Z
2025-12-04T10:59:19.3589720Z Finished inductor/test_template_heuristics_registry 1/1 ... [2025-12-04 10:59:19.358012][3572467.882821084], took 0.04min
2025-12-04T10:59:19.3592969Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T10:59:19.3648150Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T10:59:19.3650397Z Running inductor/test_async_compile 1/1 ... [2025-12-04 10:59:19.364853][3572467.889666494]
2025-12-04T10:59:19.3650720Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T10:59:19.3651578Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_async_compile.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:59:19.365038]
2025-12-04T10:59:51.8313132Z
2025-12-04T10:59:51.8313951Z inductor/test_async_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_async_compile_1.1_e3337bac7850f608_.log
2025-12-04T10:59:51.8315173Z Running 8 items in this shard: test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_fork, test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_spawn, test/inductor/test_async_compile.py::TestAsyncCompile::test_autotune_lookup_table_method_subprocess, test/inductor/test_async_compile.py::TestAsyncCompile::test_bad_kernel, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_fork, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_spawn, test/inductor/test_async_compile.py::TestAsyncCompile::test_pool_method_subprocess, test/inductor/test_async_compile.py::TestAsyncCompile::test_wait_pool_ready
2025-12-04T10:59:51.8316636Z
2025-12-04T10:59:51.8316759Z Finished inductor/test_async_compile 1/1 ... [2025-12-04 10:59:51.831002][3572500.355812554], took 0.54min
2025-12-04T10:59:51.8320721Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T10:59:51.8372968Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T10:59:51.8375465Z Running inductor/test_gpu_cpp_wrapper 1/1 ... [2025-12-04 10:59:51.837402][3572500.362214841]
2025-12-04T10:59:51.8375664Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T10:59:51.8377313Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_gpu_cpp_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:59:51.837610]
2025-12-04T11:04:51.7461282Z
2025-12-04T11:04:51.7462527Z inductor/test_gpu_cpp_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_gpu_cpp_wrapper_1.1_144dd62ffd645933_.log
2025-12-04T11:04:51.7517094Z Running 295 items in this shard: test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex4_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_add_complex_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_adding_tensor_offsets_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_addmm_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_aoti_debug_printer_works_on_constants, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_as_strided_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_batch_norm_2d_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bernoulli1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bitwise_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_bmm2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_buffer_use_after_remove_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_cat_slice_cat_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_consecutive_split_cumprod_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_conv_backward_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_convolution1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_1_cuda_gpu_wrapper,
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_custom_op_3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_bfloat16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_int8_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float32_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_float64_uint8_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_fusion_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int16_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int32_uint8_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int64_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_int8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_bfloat16_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dtypeview_uint8_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_embedding_bag_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_enable_dynamic_shapes_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_fft_real_input_real_output_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_foreach_cpp_wrapper_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_put_deterministic_fallback_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_index_tensor_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_inductor_layout_optimization_input_mutations_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_insignificant_strides_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_layer_norm_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_linear_relu_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm2_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_plus_mm3_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_mm_views_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_device_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_multi_threading_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_non_tensor_args_wrapped_on_cpu, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_h_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pointwise_hermite_polynomial_he_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_pow3_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_profiler_mark_wrapper_call_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_randint_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_reduction1_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_relu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_repeat_interleave_2_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_roi_align_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scalar_input_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_attention_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_scaled_dot_product_efficient_attention_cuda_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_silu_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sort_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_dtype_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_sum_int_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_transpose_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_bfloat16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_float64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int16_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int32_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int64_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_int8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::TestGpuWrapper::test_unspec_inputs_uint8_cuda_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex4_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_add_complex_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_adding_tensor_offsets_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_addmm_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_annotation_training, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_as_strided_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_batch_norm_2d_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bernoulli1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bitwise_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_bmm2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_buffer_use_after_remove_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_cat_slice_cat_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_consecutive_split_cumprod_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_conv_backward_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_convolution1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_custom_op_3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_bfloat16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float32_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float16_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_float64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_fusion_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int16_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int32_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float32_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int64_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_int8_uint8_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dtypeview_uint8_uint8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_dynamic_shapes_persistent_reduction_mixed_x_dim_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_embedding_bag_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_enable_dynamic_shapes_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_fft_real_input_real_output_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_foreach_cpp_wrapper_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_put_deterministic_fallback_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_index_tensor_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_inductor_layout_optimization_input_mutations_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_insignificant_strides_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_layer_norm_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_linear_relu_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm2_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_plus_mm3_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_mm_views_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_device_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_multi_threading_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_h_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pointwise_hermite_polynomial_he_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_pow3_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_profiler_mark_wrapper_call_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_randint_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_reduction1_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_relu_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_repeat_interleave_2_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_roi_align_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scalar_input_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_scaled_dot_product_efficient_attention_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_silu_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sort_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_dtype_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_sum_int_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_transpose_cuda_dynamic_shapes_gpu_wrapper, 
test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_bfloat16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_float64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int16_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int32_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int64_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_int8_cuda_dynamic_shapes_gpu_wrapper, test/inductor/test_gpu_cpp_wrapper.py::DynamicShapesGpuWrapperGpuTests::test_unspec_inputs_uint8_cuda_dynamic_shapes_gpu_wrapper 2025-12-04T11:04:51.7561418Z 2025-12-04T11:04:51.7561551Z Finished inductor/test_gpu_cpp_wrapper 1/1 ... [2025-12-04 11:04:51.746208][3572800.271016059], took 5.00min 2025-12-04T11:04:51.7561991Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:04:51.7562344Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:04:51.7562561Z Running dynamo/test_utils 1/1 ... 
[2025-12-04 11:04:51.753324][3572800.278136802]
2025-12-04T11:04:51.7562736Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:04:51.7563113Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:04:51.753559]
2025-12-04T11:05:09.4938500Z
2025-12-04T11:05:09.4939324Z dynamo/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_utils_1.1_e322294a1c8fb293_.log
2025-12-04T11:05:09.4941252Z Running 17 items in this shard: test/dynamo/test_utils.py::TestUtils::test_graph_break_counting, test/dynamo/test_utils.py::TestUtils::test_larger_multiplier_for_even_smaller_tensor, test/dynamo/test_utils.py::TestUtils::test_larger_multiplier_for_smaller_tensor, test/dynamo/test_utils.py::TestUtils::test_nan, test/dynamo/test_utils.py::TestUtils::test_traced_code_query, test/dynamo/test_utils.py::TestDynamoTimed::test_compiler_config, test/dynamo/test_utils.py::TestDynamoTimed::test_dynamic_shape_feature_use, test/dynamo/test_utils.py::TestDynamoTimed::test_dynamo_timed, test/dynamo/test_utils.py::TestDynamoTimed::test_exception_stack_trace, test/dynamo/test_utils.py::TestDynamoTimed::test_graph_node_shapes, test/dynamo/test_utils.py::TestDynamoTimed::test_inductor_provenance, test/dynamo/test_utils.py::TestDynamoTimed::test_ir_count, test/dynamo/test_utils.py::TestDynamoTimed::test_log_dynamo_start, test/dynamo/test_utils.py::TestDynamoTimed::test_num_params, test/dynamo/test_utils.py::TestDynamoTimed::test_stack_trace, test/dynamo/test_utils.py::TestInductorConfigParsingForLogging::test_inductor_config_jsonify, test/dynamo/test_utils.py::TestInductorConfigParsingForLogging::test_inductor_config_parsing_non_conforming_items
2025-12-04T11:05:09.4943636Z
2025-12-04T11:05:09.4943743Z Finished dynamo/test_utils 1/1 ... [2025-12-04 11:05:09.493483][3572818.01829178], took 0.30min
2025-12-04T11:05:09.4945411Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:05:09.5006059Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:05:09.5008527Z Running inductor/test_provenance_tracing 1/1 ... [2025-12-04 11:05:09.500764][3572818.025574011]
2025-12-04T11:05:09.5008904Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:05:09.5011220Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_provenance_tracing.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:05:09.501018]
2025-12-04T11:05:54.5297831Z
2025-12-04T11:05:54.5298797Z inductor/test_provenance_tracing 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_provenance_tracing_1.1_5cbcfec75d1d5762_.log
2025-12-04T11:05:54.5304920Z Running 16 items in this shard: test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_combo_kernel, test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_cpu, test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_cuda, test/inductor/test_provenance_tracing.py::TestProvenanceTracingArtifact::test_triton_kernel_to_post_grad_tracing_extern_kernel, test/inductor/test_provenance_tracing.py::TestProvenanceTracingNodeMapping::test_create_node_mapping, test/inductor/test_provenance_tracing.py::TestProvenanceTracingNodeMeta::test_pattern_matcher_transfer_meta, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_cpu_extern_kernel,
test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_create_kernel_information_json_function, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_deferred_triton_kernels, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_kernel_information_generation, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_no_kernel_information_without_provenance_tracking, test/inductor/test_provenance_tracing.py::TestProvenanceTracingStackTraces::test_tlparse_kernel_stack_traces, test/inductor/test_provenance_tracing.py::TestProvenanceTracingKernelContextCpu::test_aoti_python_stack_traces_cpu, test/inductor/test_provenance_tracing.py::TestProvenanceTracingKernelContextCpu::test_jit_inductor_with_flag_cpu, test/inductor/test_provenance_tracing.py::TestProvenanceTracingKernelContextGpu::test_aoti_python_stack_traces_cuda, test/inductor/test_provenance_tracing.py::TestProvenanceTracingKernelContextGpu::test_jit_inductor_with_flag_cuda 2025-12-04T11:05:54.5309359Z 2025-12-04T11:05:54.5309657Z Finished inductor/test_provenance_tracing 1/1 ... [2025-12-04 11:05:54.529410][3572863.054220035], took 0.75min 2025-12-04T11:05:54.5310291Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:05:54.5364228Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:05:54.5364541Z Running dynamo/test_interop 1/1 ... [2025-12-04 11:05:54.536301][3572863.061115142] 2025-12-04T11:05:54.5364771Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:05:54.5367944Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_interop.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:05:54.536528] 2025-12-04T11:05:57.3556981Z 2025-12-04T11:05:57.3560879Z dynamo/test_interop 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_interop_1.1_a382c4746be1f87a_.log 2025-12-04T11:05:57.3574623Z Running 5 items in this shard: test/dynamo/test_interop.py::InteropTests::test_fx_fn, test/dynamo/test_interop.py::InteropTests::test_script_fn, test/dynamo/test_interop.py::InteropTests::test_staticmethod_script_fn, test/dynamo/test_interop.py::InteropTests::test_trace_fn, test/dynamo/test_interop.py::InteropTests::test_vmap_in_graph 2025-12-04T11:05:57.3575842Z 2025-12-04T11:05:57.3576047Z Finished dynamo/test_interop 1/1 ... [2025-12-04 11:05:57.355359][3572865.880165846], took 0.05min 2025-12-04T11:05:57.3576822Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:05:57.3627697Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:05:57.3630399Z Running functorch/test_eager_transforms 1/1 ... [2025-12-04 11:05:57.362837][3572865.887651293] 2025-12-04T11:05:57.3630631Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:05:57.3632252Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_eager_transforms.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:05:57.363059] 2025-12-04T11:06:12.1022492Z 2025-12-04T11:06:12.1024102Z functorch/test_eager_transforms 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_eager_transforms_1.1_5e0a90de568542ed_.log 2025-12-04T11:06:12.1081162Z Running 360 items in this shard: test/functorch/test_eager_transforms.py::TestSliceArgnums::test_argnums_reorders, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_duplicate_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_negative_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_positive_int_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_flat_args_with_tuple_argnum, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_invalid_argnum_type, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_not_enough_argnums, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_out_of_bounds_argnum_values, test/functorch/test_eager_transforms.py::TestSliceArgnums::test_pytree_args, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_buffer_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_combine_state_for_ensemble_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_correctness_mnist_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_disable_autograd_tracking_disable_autograd_tracking_True, 
test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_functional_call, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_make_functional_state_correctly_returned_after_forward_mechanism_make_functional, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_ensemble, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_parameter_tying_grad, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_leaf, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_mismatch_error, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_stack_module_state_smoke, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_using_detach_functional_call_detach_params_True, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_False, test/functorch/test_eager_transforms.py::TestMakeFunctional::test_with_buffers_disable_autograd_tracking_disable_autograd_tracking_True, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_advanced_indexing_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composed_with_autograd_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_composite_two_ops_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_conj_bit_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_dtype_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_ignored_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_escaped_wrappers_are_marked_as_dead_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_fn_with_kwargs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_functional_init_with_buffers_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_of_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_grad_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_base_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_inplace_on_view_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_invalid_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_is_cuda_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_layout_sparse_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_manual_seed_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_negative_argnums_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_nesting_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_complicated_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_nested_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_fn_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_outside_vjp_only_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_no_grad_value_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_numel_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_out_of_order_argnums_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_primitive_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_print_captured_tensor_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_shape_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_ctor_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_tensor_print_vmap_vmap_cuda, 
test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_hessian_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_unrelated_vjp_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_view_inplace_simple_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_views_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_of_grad_composition_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_error_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_input_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_pytree_output_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_vjp_two_outputs_cuda, test/functorch/test_eager_transforms.py::TestGradTransformCUDA::test_zero_grad_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_log_softmax_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_empty_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_new_zeros_materializes_tensor_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_functional_call_cuda, 
test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_embeddingnet_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestVmapOfGradCUDA::test_per_sample_grads_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_correctness_different_devices_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_default_arg_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_multi_input_multi_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_unrelated_outputs_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_against_reference_zero_dim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_defaults_to_zero_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_effect_on_return_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_argnums_tuple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_aux_tensor_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_chunksize_one__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_False_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_chunk_jacrev_composition__preallocate_and_copy_True_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_complex_error_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_diff_numel_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_dimensionality_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_empty_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_float_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_hessian_simple_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_inplace_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_jac_with_non_tensor_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_args_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_outputs_pytree_multidim_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_inputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_multiple_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_multiple_outputs_single_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_negative_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_nested_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_out_of_bounds_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_outputs_can_any_pytree_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_repeated_argnums_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_simple_not_flat_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_take_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_input_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestJacCUDA::test_unrelated_output_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestJacCUDA::test_vmap_on_jac_simple_jacrev_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_autograd_function_disables_fwd_grad_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_aux_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_inside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_mixed_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_disable_fwd_grad_outside_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inplace_on_captures_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_inputs_are_tuples_of_tensors_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_inside_autograd_function_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_jvp_new_tensor_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_inputs_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_multiple_outputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_nonempty_primals_and_tangents_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_outputs_can_any_pytree_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_primals_tangents_length_mismatch_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_pytree_inputs_error_cases_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_simple_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_strict_mode_cuda, 
test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_input_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_unrelated_output_cuda, test/functorch/test_eager_transforms.py::TestJvpCUDA::test_zerotensor_vmapjvp_interaction_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_basic_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_grad_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_composition_vmap_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_errors_cuda, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_nested_input_nested_output_cuda_float32, test/functorch/test_eager_transforms.py::TestLinearizeCUDA::test_linearize_return_cuda_float32, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_base_view_inplace_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_all_dual_no_view_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_base_prop_cuda, test/functorch/test_eager_transforms.py::TestVmapJvpInplaceViewCUDA::test_right_dual_view_prop_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_multi_input_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_simple_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_hessian_vectorize_correctness_unrelated_outputs_cuda, test/functorch/test_eager_transforms.py::TestHessianCUDA::test_jacfwd_different_levels_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_functionalize_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_function_no_setup_context_transform_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacfwd_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jacrev_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_jvp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_autograd_functional_vjp_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_functionalize_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_grad_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_can_use_vmap_when_key_is_excluded_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_functionalize_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_grad_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacfwd_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_transforms_transform_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_deprecation_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_grad_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_jvp_supports_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_jacrev_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_make_fx_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_no_warning_on_import_functorch_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_requires_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_retain_grad_inside_transform_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_and_value_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_hessian_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_transforms_dont_support_saved_tensor_hooks_transform_jacrev_cuda, 
test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_doesnt_support_saved_tensor_hooks_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vjp_vmap_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_grad_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vjp_cuda, test/functorch/test_eager_transforms.py::TestComposabilityCUDA::test_vmap_vmap_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_ensemble_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_AlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_Dropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_find_learning_rate_ensembling_FeatureAlphaDropout_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacfwd_cuda, 
test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_lennard_jones_batched_jac_jac_jacrev_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_omniglot_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_maml_regression_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_functional_call_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_resnet18_per_sample_grads_mechanism_make_functional_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_functional_call_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_False_cuda, test/functorch/test_eager_transforms.py::TestExamplesCorrectnessCUDA::test_update_batch_norm_mechanism_make_functional_originally_track_running_stats_True_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_basic_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_functional_call_multiple_dicts_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_name_wrapping_cuda, 
test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_inside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_no_grad_outside_grad_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_grad_sum_cuda, test/functorch/test_eager_transforms.py::TestHigherOrderOperatorInteractionCUDA::test_vmap_sum_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fake_tensors_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_multi_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_out_op_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_reapply_views_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_fx_transpose_simple_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_grad_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_nonfunctional_output_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_opt_tensor_list_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist1_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_functionalize_optional_tensorlist2_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_inplace_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_linear_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_inplace_slice_view_cuda, 
test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_multioutput_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_resize_program_inputs_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_simple_view_cuda, test/functorch/test_eager_transforms.py::TestFunctionalizeCUDA::test_vmap_functionalize_jvp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_False_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_input_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_jvp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_input_mark_dirty_True_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_neither_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_False_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_function_returns_input_inner_requires_grad_True_save_for_vjp_save_tensors_output_mark_dirty_True_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_grad_fn_name_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_needs_input_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_autograd_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_once_differentiable_grad_vjp_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionCUDA::test_set_materialize_grads_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_has_vmap_staticmethod_and_has_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_multiple_inputs_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_in_dims_single_input_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_incompatible_out_dims_error_msg_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_info_object_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_kwarg_only_tensors_cuda, 
test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_no_vmap_staticmethod_and_no_generate_vmap_rule_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_none_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_should_have_two_returns_cuda, test/functorch/test_eager_transforms.py::TestAutogradFunctionVmapAPICUDA::test_skips_empty_layer_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_error_if_name_collision_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_nesting_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_overrides_saved_tensors_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_CtxWithSavedTensors_passthrough_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_debug_unwrap_cuda, test/functorch/test_eager_transforms.py::TestHelpersCUDA::test_reductify_leaf_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_compile_vmap_hessian_cuda, test/functorch/test_eager_transforms.py::TestCompileTransformsCUDA::test_grad_deprecated_api_cuda, test/functorch/test_eager_transforms.py::TestGradTrackingTensorToListCUDA::test_tolist_conj_neg_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTrackingTensorToListCUDA::test_tolist_multidimensional_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTrackingTensorToListCUDA::test_tolist_nested_grad_cuda, test/functorch/test_eager_transforms.py::TestGradTrackingTensorToListCUDA::test_tolist_with_grad_cuda 2025-12-04T11:06:12.1131268Z 2025-12-04T11:06:12.1131397Z Finished functorch/test_eager_transforms 1/1 ... 
[2025-12-04 11:06:12.102379][3572880.6271873], took 0.25min 2025-12-04T11:06:12.1131806Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:12.1132202Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:12.1132430Z Running inductor/test_benchmarking 1/1 ... [2025-12-04 11:06:12.109566][3572880.634379422] 2025-12-04T11:06:12.1132620Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:12.1133015Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_benchmarking.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:06:12.109780] 2025-12-04T11:06:18.6339038Z 2025-12-04T11:06:18.6340168Z inductor/test_benchmarking 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_benchmarking_1.1_cb97d79a2a4ae161_.log 2025-12-04T11:06:18.6345924Z Running 12 items in this shard: test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_cpu_smoke_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_cpu_smoke_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_gpu_smoke_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_gpu_smoke_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_many_devices_benchmarker_cls0, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_many_devices_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_no_devices_benchmarker_cls0, 
test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_safely_infers_device_no_devices_benchmarker_cls1, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls0_device_cpu, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls0_device_cuda, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls1_device_cpu, test/inductor/test_benchmarking.py::TestBenchmarker::test_benchmark_smoke_benchmarker_cls1_device_cuda 2025-12-04T11:06:18.6349676Z 2025-12-04T11:06:18.6349923Z Finished inductor/test_benchmarking 1/1 ... [2025-12-04 11:06:18.633675][3572887.15848397], took 0.11min 2025-12-04T11:06:18.6350771Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:18.6409941Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:18.6412199Z Running inductor/test_helion_kernels 1/1 ... [2025-12-04 11:06:18.641113][3572887.165926258] 2025-12-04T11:06:18.6412499Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:18.6414855Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_helion_kernels.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:06:18.641356] 2025-12-04T11:06:24.0641708Z 2025-12-04T11:06:24.0643115Z inductor/test_helion_kernels 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_helion_kernels_1.1_7b3cbbe220d81261_.log 2025-12-04T11:06:24.0644547Z Running 2 items in this shard: test/inductor/test_helion_kernels.py::HelionTests::test_add_kernel, test/inductor/test_helion_kernels.py::HelionTests::test_softmax_view_reshape 2025-12-04T11:06:24.0645254Z 2025-12-04T11:06:24.0645596Z Finished inductor/test_helion_kernels 1/1 ... [2025-12-04 11:06:24.063822][3572892.588631333], took 0.09min 2025-12-04T11:06:24.0653400Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:24.0711788Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:24.0714278Z Running inductor/test_quantization 1/1 ... [2025-12-04 11:06:24.071267][3572892.59608007] 2025-12-04T11:06:24.0714497Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:24.0715930Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_quantization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:06:24.071491] 2025-12-04T11:06:39.7092172Z 2025-12-04T11:06:39.7093601Z inductor/test_quantization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_quantization_1.1_106716b0f4220fe4_.log 2025-12-04T11:06:39.7095347Z Running 2 items in this shard: test/inductor/test_quantization.py::TestQuantization::test_activation_quantization_aten_with_scaling, test/inductor/test_quantization.py::TestQuantization::test_activation_quantization_aten_without_scaling 2025-12-04T11:06:39.7096379Z 2025-12-04T11:06:39.7096733Z Finished inductor/test_quantization 1/1 ... [2025-12-04 11:06:39.708861][3572908.233669228], took 0.26min 2025-12-04T11:06:39.7104381Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:39.7165451Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:39.7166018Z Running inductor/test_best_config 1/1 ... [2025-12-04 11:06:39.716411][3572908.241224614] 2025-12-04T11:06:39.7166363Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:39.7168546Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_best_config.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:06:39.716638] 2025-12-04T11:06:46.4402528Z 2025-12-04T11:06:46.4403443Z inductor/test_best_config 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_best_config_1.1_7f0b61aae7c316aa_.log 2025-12-04T11:06:46.4404814Z Running 1 items in this shard: test/inductor/test_best_config.py::TestKernelBestConfig::test_best_config_has_triton_cache_key 2025-12-04T11:06:46.4405142Z 2025-12-04T11:06:46.4405351Z Finished inductor/test_best_config 1/1 ... 
[2025-12-04 11:06:46.439960][3572914.964769394], took 0.11min 2025-12-04T11:06:46.4414377Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:46.4472736Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:46.4474763Z Running export/test_tools 1/1 ... [2025-12-04 11:06:46.447356][3572914.972169862] 2025-12-04T11:06:46.4475021Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:46.4478807Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_tools.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:06:46.447577] 2025-12-04T11:06:48.9667120Z 2025-12-04T11:06:48.9668157Z export/test_tools 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_tools_1.1_52ebe7b43453fb37_.log 2025-12-04T11:06:48.9669560Z Running 2 items in this shard: test/export/test_tools.py::TestExportTools::test_report_exportability_basic, test/export/test_tools.py::TestExportTools::test_report_exportability_with_issues 2025-12-04T11:06:48.9670336Z 2025-12-04T11:06:48.9670631Z Finished export/test_tools 1/1 ... [2025-12-04 11:06:48.966361][3572917.491170582], took 0.04min 2025-12-04T11:06:48.9676169Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:06:48.9736139Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:06:48.9736587Z Running inductor/test_compiled_optimizers 1/2 ... 
[2025-12-04 11:06:48.973438][3572917.498251645] 2025-12-04T11:06:48.9736942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:06:48.9738386Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compiled_optimizers.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:06:48.973642] 2025-12-04T11:12:14.5826153Z 2025-12-04T11:12:14.5826906Z inductor/test_compiled_optimizers 1/2 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compiled_optimizers_1.2_05e9ac0156bcabb9_.log 2025-12-04T11:12:14.5881923Z Running 329 items in this shard: test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_rho_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adadelta_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_initial_accumulator_value_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_lr_decay_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cosineannealinglr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_cycliclr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_tensor_lr_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adagrad_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_exponentiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_onecyclelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_tensor_lr_tensor_betas_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_amsgrad_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_capturable_foreach_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_tensor_lr_weight_decay_capturable_foreach_cuda_reducelronplateau, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamax_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_cycliclr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_amsgrad_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_polynomiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_amsgrad_capturable_foreach_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_tensor_lr_tensor_betas_capturable_foreach_cuda_multiplicativelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_amsgrad_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_adamw_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_lambd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_foreach, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_recompile_single, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_t0_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_exponentiallr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_tensor_lr_weight_decay_maximize_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_capturable_foreach_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_asgd_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_compile_time_smoketest, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_guard_on_none_grads, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_momentum_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_lambdalr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_tensor_lr_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_maximize_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_capturable_foreach_cuda, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_decoupled_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_nadam_weight_decay_momentum_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_decoupled_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_capturable_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_eps_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_linearlr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_tensor_lr_capturable_weight_decay_decoupled_weight_decay_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_decoupled_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_radam_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_maximize_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_recompile, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_multiplicativelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_tensor_lr_capturable_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_centered_momentum_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rmsprop_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_capturable_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_etas_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_step_sizes_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_onecyclelr, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_rprop_tensor_lr_capturable_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_dampening_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_nesterov_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_momentum_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_constantlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_cosineannealingwarmrestarts, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_exponentiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_linearlr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_multiplicativelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cpu_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_multisteplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_onecyclelr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_cuda_reducelronplateau, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealinglr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cosineannealingwarmrestarts, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_cycliclr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_lambdalr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_polynomiallr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_tensor_lr_foreach_cuda_steplr, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_cpu, 
test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_foreach_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cpu, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_sgd_weight_decay_maximize_cuda, test/inductor/test_compiled_optimizers.py::CompiledOptimizerTests::test_static_address_finalizer, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adadelta_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adafactor_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Adafactor_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_LBFGS_use_closure_False_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_LBFGS_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RAdam_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_RMSprop_use_closure_True_cuda_float32, test/inductor/test_compiled_optimizers.py::CompiledOptimizerParityTestsCUDA::test_correctness_Rprop_use_closure_False_cuda_float32 2025-12-04T11:12:14.5935266Z 2025-12-04T11:12:14.5935396Z Finished inductor/test_compiled_optimizers 1/2 ... 
[2025-12-04 11:12:14.582634][3573243.107441237], took 5.43min
2025-12-04T11:12:14.5935803Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:12:14.5936188Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:12:14.5936404Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T11:12:14.5936582Z Uploading artifacts took 0.00 seconds
2025-12-04T11:12:14.5936773Z Running inductor/test_aot_inductor_utils 1/1 ... [2025-12-04 11:12:14.589857][3573243.114668423]
2025-12-04T11:12:14.5936968Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:12:14.5937368Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_aot_inductor_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:12:14.590095]
2025-12-04T11:12:19.9992791Z
2025-12-04T11:12:19.9993780Z inductor/test_aot_inductor_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_aot_inductor_utils_1.1_da2a5654caebfbc1_.log
2025-12-04T11:12:19.9994278Z Running 0 items in this shard:
2025-12-04T11:12:19.9994383Z
2025-12-04T11:12:19.9994560Z Finished inductor/test_aot_inductor_utils 1/1 ... [2025-12-04 11:12:19.998951][3573248.523759471], took 0.09min
2025-12-04T11:12:20.0005501Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:12:20.0066850Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:12:20.0067264Z Running dynamo/test_graph_region_tracker 1/1 ... [2025-12-04 11:12:20.006580][3573248.53139317]
2025-12-04T11:12:20.0067622Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:12:20.0069867Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_graph_region_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:12:20.006804]
2025-12-04T11:12:23.7792513Z
2025-12-04T11:12:23.7793370Z dynamo/test_graph_region_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_graph_region_tracker_1.1_2b79dca233a94169_.log
2025-12-04T11:12:23.7796016Z Running 13 items in this shard: test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_get_regions_multiple_region_groups, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_get_regions_single_region_group, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_arg_shapes, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_dtypes, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mismatched_global_state, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_allow_in_graph, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_setitem, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_mutation_tracking_simple, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_nested_args, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_no_duplicate_tracking, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_no_single_node_regions, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_non_tensor_arg_hashing, test/dynamo/test_graph_region_tracker.py::GraphRegionTrackerTests::test_region_sorting
2025-12-04T11:12:23.7797883Z
2025-12-04T11:12:23.7798019Z Finished dynamo/test_graph_region_tracker 1/1 ... [2025-12-04 11:12:23.778944][3573252.303753946], took 0.06min
2025-12-04T11:12:23.7800816Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:12:23.7861581Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:12:23.7861820Z Running inductor/test_compile 1/1 ... [2025-12-04 11:12:23.786065][3573252.310878473]
2025-12-04T11:12:23.7862068Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:12:23.7865150Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_compile.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:12:23.786300]
2025-12-04T11:12:34.9218047Z
2025-12-04T11:12:34.9218823Z inductor/test_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_compile_1.1_a6b7668030bba660_.log
2025-12-04T11:12:34.9221030Z Running 10 items in this shard: test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_generate_debug_compile, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_generate_debug_symbol, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_bare_module, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_export1, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_export2, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx_dict_input, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_fx_tensor_return, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_make_fx, test/inductor/test_compile.py::TestStandaloneInductor::test_inductor_via_op_with_multiple_outputs
2025-12-04T11:12:34.9223507Z
2025-12-04T11:12:34.9223753Z Finished inductor/test_compile 1/1 ... [2025-12-04 11:12:34.921537][3573263.446347118], took 0.19min
2025-12-04T11:12:34.9228338Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:12:34.9287787Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:12:34.9288105Z Running inductor/test_scatter_optimization 1/1 ... [2025-12-04 11:12:34.928698][3573263.453512234]
2025-12-04T11:12:34.9288364Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:12:34.9290512Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_scatter_optimization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:12:34.928920]
2025-12-04T11:12:46.8161340Z
2025-12-04T11:12:46.8162263Z inductor/test_scatter_optimization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_scatter_optimization_1.1_ccd87912e9d8fcd5_.log
2025-12-04T11:12:46.8163562Z Running 8 items in this shard: test/inductor/test_scatter_optimization.py::TestScatterOpt::test_3d_tensor, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_can_not_optimize_due_to_dense, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_can_not_optimize_due_to_non_const, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_cross_entropy_loss, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_neg_scatter_dim, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_non_last_dim, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_nonzero_const_tensor, test/inductor/test_scatter_optimization.py::TestScatterOpt::test_shorter_index_tensor
2025-12-04T11:12:46.8164569Z
2025-12-04T11:12:46.8164711Z Finished inductor/test_scatter_optimization 1/1 ... [2025-12-04 11:12:46.815827][3573275.340636211], took 0.20min
2025-12-04T11:12:46.8173605Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:12:46.8232798Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:12:46.8234801Z Running dynamo/test_functions 1/1 ... [2025-12-04 11:12:46.823337][3573275.348151351]
2025-12-04T11:12:46.8235013Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:12:46.8236628Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_functions.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:12:46.823566] 2025-12-04T11:13:14.2018205Z 2025-12-04T11:13:14.2019795Z dynamo/test_functions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_functions_1.1_d844509b7e4f95f1_.log 2025-12-04T11:13:14.2097949Z Running 478 items in this shard: test/dynamo/test_functions.py::FunctionTests::test_T, test/dynamo/test_functions.py::FunctionTests::test_add, test/dynamo/test_functions.py::FunctionTests::test_add_, test/dynamo/test_functions.py::FunctionTests::test_addcdiv, test/dynamo/test_functions.py::FunctionTests::test_addcdiv_, test/dynamo/test_functions.py::FunctionTests::test_addcmul_, test/dynamo/test_functions.py::FunctionTests::test_are_functorch_transforms_active, test/dynamo/test_functions.py::FunctionTests::test_attrgetter, test/dynamo/test_functions.py::FunctionTests::test_broadcast_foreach_pow, test/dynamo/test_functions.py::FunctionTests::test_build_list_unpack, test/dynamo/test_functions.py::FunctionTests::test_call_dict1, test/dynamo/test_functions.py::FunctionTests::test_call_dict2, test/dynamo/test_functions.py::FunctionTests::test_call_dict3, test/dynamo/test_functions.py::FunctionTests::test_call_dict4, test/dynamo/test_functions.py::FunctionTests::test_call_dict5, test/dynamo/test_functions.py::FunctionTests::test_callable_builtin, test/dynamo/test_functions.py::FunctionTests::test_callable_class, test/dynamo/test_functions.py::FunctionTests::test_callable_lambda, test/dynamo/test_functions.py::FunctionTests::test_callable_list, test/dynamo/test_functions.py::FunctionTests::test_callable_torch, test/dynamo/test_functions.py::FunctionTests::test_chunks1, test/dynamo/test_functions.py::FunctionTests::test_class_dict, test/dynamo/test_functions.py::FunctionTests::test_cls_eq, test/dynamo/test_functions.py::FunctionTests::test_cls_hasattr, test/dynamo/test_functions.py::FunctionTests::test_cls_is, test/dynamo/test_functions.py::FunctionTests::test_compare_constant_and_tensor, 
test/dynamo/test_functions.py::FunctionTests::test_complex_closure, test/dynamo/test_functions.py::FunctionTests::test_const_tuple_add1, test/dynamo/test_functions.py::FunctionTests::test_const_tuple_add2, test/dynamo/test_functions.py::FunctionTests::test_constant1, test/dynamo/test_functions.py::FunctionTests::test_constant2, test/dynamo/test_functions.py::FunctionTests::test_constant3, test/dynamo/test_functions.py::FunctionTests::test_constant4, test/dynamo/test_functions.py::FunctionTests::test_constant_set, test/dynamo/test_functions.py::FunctionTests::test_context_wrapping_nested_functions_no_closure, test/dynamo/test_functions.py::FunctionTests::test_cublas_allow_tf32, test/dynamo/test_functions.py::FunctionTests::test_custom_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_default_dict_closure, test/dynamo/test_functions.py::FunctionTests::test_default_dict_constr, test/dynamo/test_functions.py::FunctionTests::test_default_dict_dict, test/dynamo/test_functions.py::FunctionTests::test_default_dict_lambda, test/dynamo/test_functions.py::FunctionTests::test_default_dict_list, test/dynamo/test_functions.py::FunctionTests::test_default_dict_set, test/dynamo/test_functions.py::FunctionTests::test_default_dict_tuple, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault1, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault2, test/dynamo/test_functions.py::FunctionTests::test_defaultdict_setdefault3, test/dynamo/test_functions.py::FunctionTests::test_del, test/dynamo/test_functions.py::FunctionTests::test_deque, test/dynamo/test_functions.py::FunctionTests::test_device, test/dynamo/test_functions.py::FunctionTests::test_device_constant, test/dynamo/test_functions.py::FunctionTests::test_dict_copy, test/dynamo/test_functions.py::FunctionTests::test_dict_fromkeys, test/dynamo/test_functions.py::FunctionTests::test_dict_hasattr, test/dynamo/test_functions.py::FunctionTests::test_dict_id_guard, 
test/dynamo/test_functions.py::FunctionTests::test_dict_items_sorted, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set1, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set2, test/dynamo/test_functions.py::FunctionTests::test_dict_key_set3, test/dynamo/test_functions.py::FunctionTests::test_dict_keys, test/dynamo/test_functions.py::FunctionTests::test_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_dict_mutable_map, test/dynamo/test_functions.py::FunctionTests::test_dict_ops, test/dynamo/test_functions.py::FunctionTests::test_dict_param_keys, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault1, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault2, test/dynamo/test_functions.py::FunctionTests::test_dict_setdefault3, test/dynamo/test_functions.py::FunctionTests::test_dict_sorted, test/dynamo/test_functions.py::FunctionTests::test_dict_tuple_lazy_guard, test/dynamo/test_functions.py::FunctionTests::test_dict_update, test/dynamo/test_functions.py::FunctionTests::test_dict_update_kwargs, test/dynamo/test_functions.py::FunctionTests::test_dict_values, test/dynamo/test_functions.py::FunctionTests::test_distributed_is_available, test/dynamo/test_functions.py::FunctionTests::test_distributed_is_initialized, test/dynamo/test_functions.py::FunctionTests::test_dtype, test/dynamo/test_functions.py::FunctionTests::test_dtype_compare, test/dynamo/test_functions.py::FunctionTests::test_elipsis, test/dynamo/test_functions.py::FunctionTests::test_enumerate, test/dynamo/test_functions.py::FunctionTests::test_enumerate_custom, test/dynamo/test_functions.py::FunctionTests::test_enumerate_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter, test/dynamo/test_functions.py::FunctionTests::test_filter_fallback, test/dynamo/test_functions.py::FunctionTests::test_filter_graph_break_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter_infinite_iterator, 
test/dynamo/test_functions.py::FunctionTests::test_filter_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_filter_with_graph_break, test/dynamo/test_functions.py::FunctionTests::test_finfo, test/dynamo/test_functions.py::FunctionTests::test_flat_param_same_storage_size, test/dynamo/test_functions.py::FunctionTests::test_float, test/dynamo/test_functions.py::FunctionTests::test_fn_with_self_set, test/dynamo/test_functions.py::FunctionTests::test_foreach_lerp_, test/dynamo/test_functions.py::FunctionTests::test_fstrings1, test/dynamo/test_functions.py::FunctionTests::test_fstrings2, test/dynamo/test_functions.py::FunctionTests::test_fstrings3, test/dynamo/test_functions.py::FunctionTests::test_fstrings4, test/dynamo/test_functions.py::FunctionTests::test_fstrings5, test/dynamo/test_functions.py::FunctionTests::test_fstrings6, test/dynamo/test_functions.py::FunctionTests::test_funcdef_closure, test/dynamo/test_functions.py::FunctionTests::test_functools_cache_guard, test/dynamo/test_functions.py::FunctionTests::test_functools_partial, test/dynamo/test_functions.py::FunctionTests::test_functools_partial_binding, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_hasattr, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_subclass, test/dynamo/test_functions.py::FunctionTests::test_generic_namedtuple_user_methods, test/dynamo/test_functions.py::FunctionTests::test_get_autocast_gpu_dtype, test/dynamo/test_functions.py::FunctionTests::test_get_calculate_correct_fan, test/dynamo/test_functions.py::FunctionTests::test_get_default_dtype, test/dynamo/test_functions.py::FunctionTests::test_get_device_properties_tensor_device, test/dynamo/test_functions.py::FunctionTests::test_get_privateuse1_name, test/dynamo/test_functions.py::FunctionTests::test_getattr, test/dynamo/test_functions.py::FunctionTests::test_getattr_metaclass, test/dynamo/test_functions.py::FunctionTests::test_globalfn, 
test/dynamo/test_functions.py::FunctionTests::test_globalmodule, test/dynamo/test_functions.py::FunctionTests::test_globalvar, test/dynamo/test_functions.py::FunctionTests::test_import1, test/dynamo/test_functions.py::FunctionTests::test_in_not_in, test/dynamo/test_functions.py::FunctionTests::test_index, test/dynamo/test_functions.py::FunctionTests::test_indexed_range, test/dynamo/test_functions.py::FunctionTests::test_indirect1, test/dynamo/test_functions.py::FunctionTests::test_indirect2, test/dynamo/test_functions.py::FunctionTests::test_indirect3, test/dynamo/test_functions.py::FunctionTests::test_inline_jit__unwrap_optional, test/dynamo/test_functions.py::FunctionTests::test_inline_jit_annotations, test/dynamo/test_functions.py::FunctionTests::test_inline_lru_cache_fn_with_default_args, test/dynamo/test_functions.py::FunctionTests::test_inline_script_if_tracing_fn_with_default_args, test/dynamo/test_functions.py::FunctionTests::test_inline_softmax, test/dynamo/test_functions.py::FunctionTests::test_inline_with_default, test/dynamo/test_functions.py::FunctionTests::test_inner_function, test/dynamo/test_functions.py::FunctionTests::test_is, test/dynamo/test_functions.py::FunctionTests::test_is_any_autocast_enabled, test/dynamo/test_functions.py::FunctionTests::test_is_checkpoint_valid, test/dynamo/test_functions.py::FunctionTests::test_is_complex, test/dynamo/test_functions.py::FunctionTests::test_is_contiguous_frame_counts, test/dynamo/test_functions.py::FunctionTests::test_is_contiguous_memory_format, test/dynamo/test_functions.py::FunctionTests::test_is_floating_point, test/dynamo/test_functions.py::FunctionTests::test_is_fx_tracing, test/dynamo/test_functions.py::FunctionTests::test_is_in_onnx_export, test/dynamo/test_functions.py::FunctionTests::test_is_inference_mode_global_recompilation, test/dynamo/test_functions.py::FunctionTests::test_is_inference_recompilation, test/dynamo/test_functions.py::FunctionTests::test_is_integer, 
test/dynamo/test_functions.py::FunctionTests::test_is_not, test/dynamo/test_functions.py::FunctionTests::test_is_not_null, test/dynamo/test_functions.py::FunctionTests::test_is_quantized, test/dynamo/test_functions.py::FunctionTests::test_is_sparse, test/dynamo/test_functions.py::FunctionTests::test_isinstance, test/dynamo/test_functions.py::FunctionTests::test_islice_chain, test/dynamo/test_functions.py::FunctionTests::test_itemgetter, test/dynamo/test_functions.py::FunctionTests::test_itertools_chain, test/dynamo/test_functions.py::FunctionTests::test_itertools_chain_from_iterable, test/dynamo/test_functions.py::FunctionTests::test_itertools_combinations, test/dynamo/test_functions.py::FunctionTests::test_itertools_compress, test/dynamo/test_functions.py::FunctionTests::test_itertools_compress_tensors, test/dynamo/test_functions.py::FunctionTests::test_itertools_filterfalse_basic, test/dynamo/test_functions.py::FunctionTests::test_itertools_pairwise, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_args, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_basic, test/dynamo/test_functions.py::FunctionTests::test_itertools_permutations_various_iterators, test/dynamo/test_functions.py::FunctionTests::test_itertools_product, test/dynamo/test_functions.py::FunctionTests::test_itertools_product_args, test/dynamo/test_functions.py::FunctionTests::test_itertools_product_various_iterators, test/dynamo/test_functions.py::FunctionTests::test_itertools_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_jit_annotate, test/dynamo/test_functions.py::FunctionTests::test_len_constant_dict, test/dynamo/test_functions.py::FunctionTests::test_len_constant_list, test/dynamo/test_functions.py::FunctionTests::test_len_constant_misc_iterables, test/dynamo/test_functions.py::FunctionTests::test_len_tensor, test/dynamo/test_functions.py::FunctionTests::test_list_add, 
test/dynamo/test_functions.py::FunctionTests::test_list_add_then_mutate, test/dynamo/test_functions.py::FunctionTests::test_list_clear, test/dynamo/test_functions.py::FunctionTests::test_list_compare_polyfill, test/dynamo/test_functions.py::FunctionTests::test_list_compare_polyfill_non_lists, test/dynamo/test_functions.py::FunctionTests::test_list_convert, test/dynamo/test_functions.py::FunctionTests::test_list_expand_lhs, test/dynamo/test_functions.py::FunctionTests::test_list_index_with_constant_tensor, test/dynamo/test_functions.py::FunctionTests::test_list_reversed, test/dynamo/test_functions.py::FunctionTests::test_list_setitem, test/dynamo/test_functions.py::FunctionTests::test_list_setitem_slice, test/dynamo/test_functions.py::FunctionTests::test_list_slice, test/dynamo/test_functions.py::FunctionTests::test_list_slice_assignment, test/dynamo/test_functions.py::FunctionTests::test_list_sorted1, test/dynamo/test_functions.py::FunctionTests::test_list_sorted2, test/dynamo/test_functions.py::FunctionTests::test_list_truth, test/dynamo/test_functions.py::FunctionTests::test_listarg1, test/dynamo/test_functions.py::FunctionTests::test_listarg2, test/dynamo/test_functions.py::FunctionTests::test_listarg3, test/dynamo/test_functions.py::FunctionTests::test_listarg4, test/dynamo/test_functions.py::FunctionTests::test_listarg5, test/dynamo/test_functions.py::FunctionTests::test_load_global_bool, test/dynamo/test_functions.py::FunctionTests::test_lru_cache_warning_issued_during_tracing, test/dynamo/test_functions.py::FunctionTests::test_mT, test/dynamo/test_functions.py::FunctionTests::test_manual_seed, test/dynamo/test_functions.py::FunctionTests::test_map_call_function_ex, test/dynamo/test_functions.py::FunctionTests::test_map_deque_extendleft, test/dynamo/test_functions.py::FunctionTests::test_map_dict_fromkeys, test/dynamo/test_functions.py::FunctionTests::test_map_enumerate, test/dynamo/test_functions.py::FunctionTests::test_map_infinite, 
test/dynamo/test_functions.py::FunctionTests::test_map_iter, test/dynamo/test_functions.py::FunctionTests::test_map_list, test/dynamo/test_functions.py::FunctionTests::test_map_list_extend, test/dynamo/test_functions.py::FunctionTests::test_map_list_slice_assign, test/dynamo/test_functions.py::FunctionTests::test_map_max, test/dynamo/test_functions.py::FunctionTests::test_map_max_const, test/dynamo/test_functions.py::FunctionTests::test_map_partial_unpack, test/dynamo/test_functions.py::FunctionTests::test_map_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_map_reduce, test/dynamo/test_functions.py::FunctionTests::test_map_return, test/dynamo/test_functions.py::FunctionTests::test_map_set, test/dynamo/test_functions.py::FunctionTests::test_map_sorted, test/dynamo/test_functions.py::FunctionTests::test_map_str_join, test/dynamo/test_functions.py::FunctionTests::test_map_sum, test/dynamo/test_functions.py::FunctionTests::test_map_tuple, test/dynamo/test_functions.py::FunctionTests::test_map_unpack_twice, test/dynamo/test_functions.py::FunctionTests::test_map_unpack_vars, test/dynamo/test_functions.py::FunctionTests::test_map_with_graph_break, test/dynamo/test_functions.py::FunctionTests::test_map_zip_dict, test/dynamo/test_functions.py::FunctionTests::test_match_mapping_and_match_keys, test/dynamo/test_functions.py::FunctionTests::test_match_sequence, test/dynamo/test_functions.py::FunctionTests::test_math_fma, test/dynamo/test_functions.py::FunctionTests::test_math_radians, test/dynamo/test_functions.py::FunctionTests::test_mean_sum_np, test/dynamo/test_functions.py::FunctionTests::test_methodcall1, test/dynamo/test_functions.py::FunctionTests::test_methodcall2, test/dynamo/test_functions.py::FunctionTests::test_methodcall3, test/dynamo/test_functions.py::FunctionTests::test_methodcaller, test/dynamo/test_functions.py::FunctionTests::test_min_max, test/dynamo/test_functions.py::FunctionTests::test_module_constant, 
test/dynamo/test_functions.py::FunctionTests::test_namedtuple, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_defaults, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_fields, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_hasattr, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_replace, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_subclass, test/dynamo/test_functions.py::FunctionTests::test_namedtuple_user_methods, test/dynamo/test_functions.py::FunctionTests::test_ndarray_builtin_functions, test/dynamo/test_functions.py::FunctionTests::test_ndarray_method, test/dynamo/test_functions.py::FunctionTests::test_ndarray_methods_returning_scalar, test/dynamo/test_functions.py::FunctionTests::test_ndarray_reshape, test/dynamo/test_functions.py::FunctionTests::test_ndarray_transpose, test/dynamo/test_functions.py::FunctionTests::test_ndim, test/dynamo/test_functions.py::FunctionTests::test_no_recompile_inner_function, test/dynamo/test_functions.py::FunctionTests::test_no_recompile_inner_lambda, test/dynamo/test_functions.py::FunctionTests::test_non_inlined_closure, test/dynamo/test_functions.py::FunctionTests::test_not_list, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_as_input_int_or_float_float, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_as_input_int_or_float_int, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_guards_float, test/dynamo/test_functions.py::FunctionTests::test_np_constant_collections_guards_int, test/dynamo/test_functions.py::FunctionTests::test_np_finfo, test/dynamo/test_functions.py::FunctionTests::test_np_iinfo, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_as_integer_ratio_num_type0, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_as_integer_ratio_num_type3, 
test/dynamo/test_functions.py::FunctionTests::test_number_method_method_bit_length_num_type1, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_conjugate_num_type2, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_conjugate_num_type4, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_hex_num_type5, test/dynamo/test_functions.py::FunctionTests::test_number_method_method_is_integer_num_type6, test/dynamo/test_functions.py::FunctionTests::test_numpy_attributes, test/dynamo/test_functions.py::FunctionTests::test_numpy_dtype_argument_to_function, test/dynamo/test_functions.py::FunctionTests::test_numpy_dtype_call_in_function, test/dynamo/test_functions.py::FunctionTests::test_numpy_fft, test/dynamo/test_functions.py::FunctionTests::test_numpy_linalg, test/dynamo/test_functions.py::FunctionTests::test_numpy_meshgrid, test/dynamo/test_functions.py::FunctionTests::test_numpy_random, test/dynamo/test_functions.py::FunctionTests::test_numpy_size, test/dynamo/test_functions.py::FunctionTests::test_obj_eq, test/dynamo/test_functions.py::FunctionTests::test_obj_is, test/dynamo/test_functions.py::FunctionTests::test_ordered_dict_kwargs, test/dynamo/test_functions.py::FunctionTests::test_partial_across_graph_break_uninvoked, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_UDF, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_partials_lambda, test/dynamo/test_functions.py::FunctionTests::test_partials_as_input_partials_mod, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_args_and_kwargs, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_mix, test/dynamo/test_functions.py::FunctionTests::test_partials_graph_break_reconstruct_mix_no_source, 
test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___annotations__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___builtins__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___call__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___class__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___closure__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___code__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___defaults__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___delattr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___dict__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___dir__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___doc__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___eq__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___format__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___ge__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___get__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___getattribute__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___globals__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___gt__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___hash__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___init__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___init_subclass__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___kwdefaults__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___le__, 
test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___lt__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___module__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___name__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___ne__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___new__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___qualname__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___reduce__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___reduce_ex__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___repr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___setattr__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___sizeof__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___str__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr___subclasshook__, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_args, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_func, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_attr_keywords, test/dynamo/test_functions.py::FunctionTests::test_partials_hasattr_set_attr, test/dynamo/test_functions.py::FunctionTests::test_partials_lambda, test/dynamo/test_functions.py::FunctionTests::test_partials_recompilation, test/dynamo/test_functions.py::FunctionTests::test_partials_torch_op_arg, test/dynamo/test_functions.py::FunctionTests::test_partials_torch_op_kwarg, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_arg, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg, test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg_method, 
test/dynamo/test_functions.py::FunctionTests::test_partials_udf_kwarg_module, test/dynamo/test_functions.py::FunctionTests::test_pop, test/dynamo/test_functions.py::FunctionTests::test_pos, test/dynamo/test_functions.py::FunctionTests::test_pos_only_args_with_same_name_in_star_kwargs, test/dynamo/test_functions.py::FunctionTests::test_pow_int, test/dynamo/test_functions.py::FunctionTests::test_promote_types, test/dynamo/test_functions.py::FunctionTests::test_rand_inlined, test/dynamo/test_functions.py::FunctionTests::test_rand_tensor_partial, test/dynamo/test_functions.py::FunctionTests::test_range1, test/dynamo/test_functions.py::FunctionTests::test_range2, test/dynamo/test_functions.py::FunctionTests::test_range_iterator, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_2, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_graph_break, test/dynamo/test_functions.py::FunctionTests::test_range_iterator_graph_break_2, test/dynamo/test_functions.py::FunctionTests::test_range_length, test/dynamo/test_functions.py::FunctionTests::test_range_with_index, test/dynamo/test_functions.py::FunctionTests::test_range_with_slice_index, test/dynamo/test_functions.py::FunctionTests::test_reduce, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_initial, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_none_initial, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_single, test/dynamo/test_functions.py::FunctionTests::test_reduce_with_single_with_initial, test/dynamo/test_functions.py::FunctionTests::test_return_dict, test/dynamo/test_functions.py::FunctionTests::test_return_dict2, test/dynamo/test_functions.py::FunctionTests::test_return_multiple_numpy_ndarray, test/dynamo/test_functions.py::FunctionTests::test_return_numpy_ndarray, test/dynamo/test_functions.py::FunctionTests::test_return_tuple1, test/dynamo/test_functions.py::FunctionTests::test_return_tuple2, 
test/dynamo/test_functions.py::FunctionTests::test_returning_recursive_func, test/dynamo/test_functions.py::FunctionTests::test_round, test/dynamo/test_functions.py::FunctionTests::test_set_add, test/dynamo/test_functions.py::FunctionTests::test_set_in_frozenset, test/dynamo/test_functions.py::FunctionTests::test_set_keys_view, test/dynamo/test_functions.py::FunctionTests::test_set_update_bytecode, test/dynamo/test_functions.py::FunctionTests::test_set_update_list_with_duplicated_items, test/dynamo/test_functions.py::FunctionTests::test_shape1, test/dynamo/test_functions.py::FunctionTests::test_shape2, test/dynamo/test_functions.py::FunctionTests::test_size_tuple_add, test/dynamo/test_functions.py::FunctionTests::test_slice1, test/dynamo/test_functions.py::FunctionTests::test_slice2, test/dynamo/test_functions.py::FunctionTests::test_slice3, test/dynamo/test_functions.py::FunctionTests::test_slice4, test/dynamo/test_functions.py::FunctionTests::test_slice5, test/dynamo/test_functions.py::FunctionTests::test_slice6, test/dynamo/test_functions.py::FunctionTests::test_slice_eq, test/dynamo/test_functions.py::FunctionTests::test_sliced_range, test/dynamo/test_functions.py::FunctionTests::test_sorted_const_key_non_const_items, test/dynamo/test_functions.py::FunctionTests::test_sourceless_build_method_type, test/dynamo/test_functions.py::FunctionTests::test_startswith, test/dynamo/test_functions.py::FunctionTests::test_sum, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut_with_start_arg, test/dynamo/test_functions.py::FunctionTests::test_sum_shortcut_with_start_kwarg, test/dynamo/test_functions.py::FunctionTests::test_sum_with_start_arg, test/dynamo/test_functions.py::FunctionTests::test_sum_with_start_kwarg, test/dynamo/test_functions.py::FunctionTests::test_symbool_to_int, test/dynamo/test_functions.py::FunctionTests::test_tensor_dim, 
test/dynamo/test_functions.py::FunctionTests::test_tensor_element_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_is_complex, test/dynamo/test_functions.py::FunctionTests::test_tensor_len, test/dynamo/test_functions.py::FunctionTests::test_tensor_new_with_shape, test/dynamo/test_functions.py::FunctionTests::test_tensor_new_with_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_size, test/dynamo/test_functions.py::FunctionTests::test_tensor_size_indexed_by_symint, test/dynamo/test_functions.py::FunctionTests::test_tensor_type, test/dynamo/test_functions.py::FunctionTests::test_tensor_type2, test/dynamo/test_functions.py::FunctionTests::test_tensor_type3, test/dynamo/test_functions.py::FunctionTests::test_tensor_type4, test/dynamo/test_functions.py::FunctionTests::test_tensor_type5, test/dynamo/test_functions.py::FunctionTests::test_to, test/dynamo/test_functions.py::FunctionTests::test_torch_distributions_functions, test/dynamo/test_functions.py::FunctionTests::test_torch_from_numpy, test/dynamo/test_functions.py::FunctionTests::test_torch_get_device_module, test/dynamo/test_functions.py::FunctionTests::test_torch_size_as_dict_key, test/dynamo/test_functions.py::FunctionTests::test_torch_size_hasattr, test/dynamo/test_functions.py::FunctionTests::test_torch_source, test/dynamo/test_functions.py::FunctionTests::test_transpose_for_scores, test/dynamo/test_functions.py::FunctionTests::test_truth, test/dynamo/test_functions.py::FunctionTests::test_tuple1, test/dynamo/test_functions.py::FunctionTests::test_tuple2, test/dynamo/test_functions.py::FunctionTests::test_tuple_contains, test/dynamo/test_functions.py::FunctionTests::test_tuple_iadd, test/dynamo/test_functions.py::FunctionTests::test_tuple_map, test/dynamo/test_functions.py::FunctionTests::test_tuple_sorted, test/dynamo/test_functions.py::FunctionTests::test_two_point_iter, test/dynamo/test_functions.py::FunctionTests::test_unary_fold_op, 
test/dynamo/test_functions.py::FunctionTests::test_unary_fold_op_seq, test/dynamo/test_functions.py::FunctionTests::test_unpack1, test/dynamo/test_functions.py::FunctionTests::test_unpack2, test/dynamo/test_functions.py::FunctionTests::test_unpack3, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex1, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex2, test/dynamo/test_functions.py::FunctionTests::test_unpack_ex3, test/dynamo/test_functions.py::FunctionTests::test_unpack_mutable_map, test/dynamo/test_functions.py::FunctionTests::test_unsqueeze_inplace, test/dynamo/test_functions.py::FunctionTests::test_viamethod, test/dynamo/test_functions.py::FunctionTests::test_viatorch, test/dynamo/test_functions.py::FunctionTests::test_zip_longest, test/dynamo/test_functions.py::FunctionTests::test_zip_reconstruct, test/dynamo/test_functions.py::DefaultsTests::test_cast_tensor_single_elem, test/dynamo/test_functions.py::DefaultsTests::test_dataclass_factory, test/dynamo/test_functions.py::DefaultsTests::test_dataclass_nested, test/dynamo/test_functions.py::DefaultsTests::test_fn_with_attr, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_construction, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_illegal_call_method, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_copy, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_difference, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_intersection, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_symmetric_difference, test/dynamo/test_functions.py::DefaultsTests::test_frozenset_return_type_method_name_union, test/dynamo/test_functions.py::DefaultsTests::test_full_with_tensor_fill_value, test/dynamo/test_functions.py::DefaultsTests::test_func_attrs, 
test/dynamo/test_functions.py::DefaultsTests::test_func_default_tensor_args, test/dynamo/test_functions.py::DefaultsTests::test_func_default_torch_args, test/dynamo/test_functions.py::DefaultsTests::test_functional_compile, test/dynamo/test_functions.py::DefaultsTests::test_functools_partial_id, test/dynamo/test_functions.py::DefaultsTests::test_fx_immutable_list_mutation_not_allowed, test/dynamo/test_functions.py::DefaultsTests::test_fx_map_aggregate, test/dynamo/test_functions.py::DefaultsTests::test_gpu_current_device, test/dynamo/test_functions.py::DefaultsTests::test_in_set_inplace, test/dynamo/test_functions.py::DefaultsTests::test_in_set_would_fail_broadcast, test/dynamo/test_functions.py::DefaultsTests::test_inspect_method_source, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_vmapped_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_init_in_compile_vmapped_mutated_tensor_tensor_multi_arg, test/dynamo/test_functions.py::DefaultsTests::test_is_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_mutated_tensor_tensor_across_graph_break, test/dynamo/test_functions.py::DefaultsTests::test_is_not_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_is_vmapped_mutated_tensor_tensor, test/dynamo/test_functions.py::DefaultsTests::test_keyword, test/dynamo/test_functions.py::DefaultsTests::test_listlike_of_tensors_contains_constant, test/dynamo/test_functions.py::DefaultsTests::test_map_strict, test/dynamo/test_functions.py::DefaultsTests::test_map_strict_with_graph_break, test/dynamo/test_functions.py::DefaultsTests::test_meth_default_tensor_args, test/dynamo/test_functions.py::DefaultsTests::test_property_class_transmute, test/dynamo/test_functions.py::DefaultsTests::test_property_functools_partial, 
test/dynamo/test_functions.py::DefaultsTests::test_pybind_object, test/dynamo/test_functions.py::DefaultsTests::test_reconstructed_name, test/dynamo/test_functions.py::DefaultsTests::test_set_call___init___frozenset, test/dynamo/test_functions.py::DefaultsTests::test_set_call___init___set, test/dynamo/test_functions.py::DefaultsTests::test_set_construction, test/dynamo/test_functions.py::DefaultsTests::test_skip_function_call_very_weird_value, test/dynamo/test_functions.py::DefaultsTests::test_str_handler_for_user_defined_object, test/dynamo/test_functions.py::DefaultsTests::test_sys_recursionlimit, test/dynamo/test_functions.py::DefaultsTests::test_tree_map, test/dynamo/test_functions.py::DefaultsTests::test_udf_list, test/dynamo/test_functions.py::DefaultsTests::test_udf_list_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_udf_list_slice, test/dynamo/test_functions.py::DefaultsTests::test_udf_namedtuple, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_construction, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_construction_custom_new, test/dynamo/test_functions.py::DefaultsTests::test_udf_tuple_reconstruction, test/dynamo/test_functions.py::DefaultsTests::test_zip_strict 2025-12-04T11:13:14.2142179Z 2025-12-04T11:13:14.2142292Z Finished dynamo/test_functions 1/1 ... [2025-12-04 11:13:14.205385][3573302.7301877], took 0.46min 2025-12-04T11:13:14.2142709Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:13:14.2143066Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:13:14.2143292Z Running inductor/test_ordered_set 1/1 ... 
[2025-12-04 11:13:14.212811][3573302.737618672] 2025-12-04T11:13:14.2143477Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:13:14.2143870Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_ordered_set.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:13:14.213113] 2025-12-04T11:13:17.8384389Z 2025-12-04T11:13:17.8385087Z inductor/test_ordered_set 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_ordered_set_1.1_3ecbeca911af8cb5_.log 2025-12-04T11:13:17.8424858Z Running 401 items in this shard: test/inductor/test_ordered_set.py::TestJointOps::test_and, test/inductor/test_ordered_set.py::TestJointOps::test_badcmp, test/inductor/test_ordered_set.py::TestJointOps::test_container_iterator, test/inductor/test_ordered_set.py::TestJointOps::test_contains, test/inductor/test_ordered_set.py::TestJointOps::test_cyclical_repr, test/inductor/test_ordered_set.py::TestJointOps::test_deepcopy, test/inductor/test_ordered_set.py::TestJointOps::test_difference, test/inductor/test_ordered_set.py::TestJointOps::test_do_not_rehash_dict_keys, test/inductor/test_ordered_set.py::TestJointOps::test_equality, test/inductor/test_ordered_set.py::TestJointOps::test_free_after_iterating, test/inductor/test_ordered_set.py::TestJointOps::test_gc, test/inductor/test_ordered_set.py::TestJointOps::test_intersection, test/inductor/test_ordered_set.py::TestJointOps::test_isdisjoint, test/inductor/test_ordered_set.py::TestJointOps::test_iterator_pickling, test/inductor/test_ordered_set.py::TestJointOps::test_len, test/inductor/test_ordered_set.py::TestJointOps::test_new_or_init, test/inductor/test_ordered_set.py::TestJointOps::test_or, test/inductor/test_ordered_set.py::TestJointOps::test_pickling, test/inductor/test_ordered_set.py::TestJointOps::test_setOfFrozensets, 
test/inductor/test_ordered_set.py::TestJointOps::test_sub, test/inductor/test_ordered_set.py::TestJointOps::test_sub_and_super, test/inductor/test_ordered_set.py::TestJointOps::test_subclass_with_custom_hash, test/inductor/test_ordered_set.py::TestJointOps::test_symmetric_difference, test/inductor/test_ordered_set.py::TestJointOps::test_union, test/inductor/test_ordered_set.py::TestJointOps::test_uniquification, test/inductor/test_ordered_set.py::TestJointOps::test_xor, test/inductor/test_ordered_set.py::TestSet::test_add, test/inductor/test_ordered_set.py::TestSet::test_and, test/inductor/test_ordered_set.py::TestSet::test_badcmp, test/inductor/test_ordered_set.py::TestSet::test_clear, test/inductor/test_ordered_set.py::TestSet::test_constructor_identity, test/inductor/test_ordered_set.py::TestSet::test_container_iterator, test/inductor/test_ordered_set.py::TestSet::test_contains, test/inductor/test_ordered_set.py::TestSet::test_copy, test/inductor/test_ordered_set.py::TestSet::test_cyclical_repr, test/inductor/test_ordered_set.py::TestSet::test_deepcopy, test/inductor/test_ordered_set.py::TestSet::test_difference, test/inductor/test_ordered_set.py::TestSet::test_difference_update, test/inductor/test_ordered_set.py::TestSet::test_discard, test/inductor/test_ordered_set.py::TestSet::test_do_not_rehash_dict_keys, test/inductor/test_ordered_set.py::TestSet::test_equality, test/inductor/test_ordered_set.py::TestSet::test_free_after_iterating, test/inductor/test_ordered_set.py::TestSet::test_gc, test/inductor/test_ordered_set.py::TestSet::test_hash, test/inductor/test_ordered_set.py::TestSet::test_iand, test/inductor/test_ordered_set.py::TestSet::test_init, test/inductor/test_ordered_set.py::TestSet::test_inplace_on_self, test/inductor/test_ordered_set.py::TestSet::test_intersection, test/inductor/test_ordered_set.py::TestSet::test_intersection_update, test/inductor/test_ordered_set.py::TestSet::test_ior, test/inductor/test_ordered_set.py::TestSet::test_isdisjoint, 
test/inductor/test_ordered_set.py::TestSet::test_isub, test/inductor/test_ordered_set.py::TestSet::test_iterator_pickling, test/inductor/test_ordered_set.py::TestSet::test_ixor, test/inductor/test_ordered_set.py::TestSet::test_len, test/inductor/test_ordered_set.py::TestSet::test_new_or_init, test/inductor/test_ordered_set.py::TestSet::test_or, test/inductor/test_ordered_set.py::TestSet::test_pickling, test/inductor/test_ordered_set.py::TestSet::test_pop, test/inductor/test_ordered_set.py::TestSet::test_remove, test/inductor/test_ordered_set.py::TestSet::test_remove_keyerror_set, test/inductor/test_ordered_set.py::TestSet::test_remove_keyerror_unpacking, test/inductor/test_ordered_set.py::TestSet::test_rich_compare, test/inductor/test_ordered_set.py::TestSet::test_setOfFrozensets, test/inductor/test_ordered_set.py::TestSet::test_set_literal, test/inductor/test_ordered_set.py::TestSet::test_set_literal_evaluation_order, test/inductor/test_ordered_set.py::TestSet::test_set_literal_insertion_order, test/inductor/test_ordered_set.py::TestSet::test_sub, test/inductor/test_ordered_set.py::TestSet::test_sub_and_super, test/inductor/test_ordered_set.py::TestSet::test_subclass_with_custom_hash, test/inductor/test_ordered_set.py::TestSet::test_symmetric_difference, test/inductor/test_ordered_set.py::TestSet::test_symmetric_difference_update, test/inductor/test_ordered_set.py::TestSet::test_union, test/inductor/test_ordered_set.py::TestSet::test_uniquification, test/inductor/test_ordered_set.py::TestSet::test_update, test/inductor/test_ordered_set.py::TestSet::test_weakref, test/inductor/test_ordered_set.py::TestSet::test_xor, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_intersection, 
test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_length, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsEmpty::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_equivalent_equality, 
test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_in, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_length, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_not_in, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsSingleton::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_in, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_intersection_empty, 
test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_length, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_not_in, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsTuple::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_iteration, 
test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_length, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsTriple::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsString::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsString::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsString::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsString::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsString::test_length, test/inductor/test_ordered_set.py::TestBasicOpsString::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsString::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_difference, 
test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsString::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsString::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_length, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_symmetric_difference, 
test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_self_union, test/inductor/test_ordered_set.py::TestBasicOpsBytes::test_union_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_copy, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_difference_rev, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_intersection, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_empty_union, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_equivalent_equality, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_intersection_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_isdisjoint_empty, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_issue_37219, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_iteration, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_length, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_pickling, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_repr, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_equality, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_intersection, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_isdisjoint, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_symmetric_difference, test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_self_union, 
test/inductor/test_ordered_set.py::TestBasicOpsMixedStringBytes::test_union_empty, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_changingSizeWhileIterating, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_instanceWithException, test/inductor/test_ordered_set.py::TestExceptionPropagation::test_instancesWithoutException, test/inductor/test_ordered_set.py::TestSetOfSets::test_constructor, test/inductor/test_ordered_set.py::TestBinaryOps::test_eq, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_intersection_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_isdisjoint_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_sym_difference_superset, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_non_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_overlap, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_subset, test/inductor/test_ordered_set.py::TestBinaryOps::test_union_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_overlap, 
test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_difference_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_intersection_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_sym_difference_superset, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_method_call, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_non_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_overlap, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_subset, test/inductor/test_ordered_set.py::TestUpdateOps::test_union_superset, test/inductor/test_ordered_set.py::TestMutate::test_add_absent, test/inductor/test_ordered_set.py::TestMutate::test_add_present, test/inductor/test_ordered_set.py::TestMutate::test_add_until_full, test/inductor/test_ordered_set.py::TestMutate::test_clear, test/inductor/test_ordered_set.py::TestMutate::test_discard_absent, test/inductor/test_ordered_set.py::TestMutate::test_discard_present, test/inductor/test_ordered_set.py::TestMutate::test_pop, test/inductor/test_ordered_set.py::TestMutate::test_remove_absent, test/inductor/test_ordered_set.py::TestMutate::test_remove_present, test/inductor/test_ordered_set.py::TestMutate::test_remove_until_empty, 
test/inductor/test_ordered_set.py::TestMutate::test_update_empty_tuple, test/inductor/test_ordered_set.py::TestMutate::test_update_unit_tuple_non_overlap, test/inductor/test_ordered_set.py::TestMutate::test_update_unit_tuple_overlap, test/inductor/test_ordered_set.py::TestSubsets::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEqualEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEqualNonEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetEmptyNonEmpty::test_issubset, test/inductor/test_ordered_set.py::TestSubsetPartial::test_issubset, test/inductor/test_ordered_set.py::TestSubsetNonOverlap::test_issubset, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_union, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_update, test/inductor/test_ordered_set.py::TestOnlySetsNumeric::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_difference_update_operator, 
test/inductor/test_ordered_set.py::TestOnlySetsDict::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_union, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_update, test/inductor/test_ordered_set.py::TestOnlySetsDict::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_union, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_update, test/inductor/test_ordered_set.py::TestOnlySetsOperator::test_update_operator, 
test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_union, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_update, test/inductor/test_ordered_set.py::TestOnlySetsTuple::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsString::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_sym_difference_update_operator, 
test/inductor/test_ordered_set.py::TestOnlySetsString::test_union, test/inductor/test_ordered_set.py::TestOnlySetsString::test_update, test/inductor/test_ordered_set.py::TestOnlySetsString::test_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_eq_ne, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_ge_gt_le_lt, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_intersection_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_sym_difference_update_operator, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_union, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_update, test/inductor/test_ordered_set.py::TestOnlySetsGenerator::test_update_operator, test/inductor/test_ordered_set.py::TestCopyingEmpty::test_copy, test/inductor/test_ordered_set.py::TestCopyingEmpty::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingSingleton::test_copy, test/inductor/test_ordered_set.py::TestCopyingSingleton::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingTriple::test_copy, test/inductor/test_ordered_set.py::TestCopyingTriple::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingTuple::test_copy, test/inductor/test_ordered_set.py::TestCopyingTuple::test_deep_copy, test/inductor/test_ordered_set.py::TestCopyingNested::test_copy, 
test/inductor/test_ordered_set.py::TestCopyingNested::test_deep_copy, test/inductor/test_ordered_set.py::TestIdentities::test_binopsVsSubsets, test/inductor/test_ordered_set.py::TestIdentities::test_commutativity, test/inductor/test_ordered_set.py::TestIdentities::test_exclusion, test/inductor/test_ordered_set.py::TestIdentities::test_summations, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_constructor, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_inline_methods, test/inductor/test_ordered_set.py::TestVariousIteratorArgs::test_inplace_methods, test/inductor/test_ordered_set.py::TestWeirdBugs::test_8420_set_merge, test/inductor/test_ordered_set.py::TestWeirdBugs::test_iter_and_mutate, test/inductor/test_ordered_set.py::TestWeirdBugs::test_merge_and_mutate, test/inductor/test_ordered_set.py::TestGraphs::test_cube, test/inductor/test_ordered_set.py::TestGraphs::test_cuboctahedron 2025-12-04T11:13:17.8463706Z 2025-12-04T11:13:17.8463827Z Finished inductor/test_ordered_set 1/1 ... [2025-12-04 11:13:17.838373][3573306.363178878], took 0.06min 2025-12-04T11:13:17.8464225Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:13:17.8464578Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:13:17.8464810Z Running dynamo/test_install_free_tensors 1/1 ... [2025-12-04 11:13:17.845705][3573306.370516841] 2025-12-04T11:13:17.8465007Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:13:17.8465407Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_install_free_tensors.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:13:17.845958] 2025-12-04T11:13:55.3248238Z 2025-12-04T11:13:55.3249326Z dynamo/test_install_free_tensors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_install_free_tensors_1.1_e7643023c1785554_.log 2025-12-04T11:13:55.3253275Z Running 25 items in this shard: test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_breadth_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_nested_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_nets_as_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_buffer_and_param_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_buffer_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_linear, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_optimizing_params_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_resnet_structure, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_simple_batchnorm, test/dynamo/test_install_free_tensors.py::InstallParamsAsGraphAttrTests::test_transformer, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_dict_of_tensor, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_global_tensor_export, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_list_of_tensor, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_modify_net_state, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_nested_list_of_tensor, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_nonlocal_closure, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_buffer_and_param_in_input, 
test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_buffer_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_optimizing_params_in_input, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_resnet_structure, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_simple_batchnorm, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_simple_linear, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_tensors_as_nn_attr, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_transformer, test/dynamo/test_install_free_tensors.py::InstallParamsWhenExport::test_user_defined_object 2025-12-04T11:13:55.3256779Z 2025-12-04T11:13:55.3256909Z Finished dynamo/test_install_free_tensors 1/1 ... [2025-12-04 11:13:55.324520][3573343.849329952], took 0.62min 2025-12-04T11:13:55.3257452Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:13:55.3311742Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:13:55.3313298Z Running inductor/test_torchinductor_codegen_config_overrides 1/1 ... [2025-12-04 11:13:55.331195][3573343.856005695] 2025-12-04T11:13:55.3313548Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:13:55.3315888Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_torchinductor_codegen_config_overrides.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:13:55.331484] 2025-12-04T11:14:05.2298371Z 2025-12-04T11:14:05.2302923Z inductor/test_torchinductor_codegen_config_overrides 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_torchinductor_codegen_config_overrides_1.1_b771349d8ac4c7f3_.log 2025-12-04T11:14:05.2304528Z Running 4 items in this shard: test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_cse_make_block_ptr_reduction, test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_force_pointwise_cat_force_pointwise_cat_False, test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_force_pointwise_cat_force_pointwise_cat_True, test/inductor/test_torchinductor_codegen_config_overrides.py::CodegenInductorTest::test_kernel_fusion_thresholds 2025-12-04T11:14:05.2305387Z 2025-12-04T11:14:05.2305556Z Finished inductor/test_torchinductor_codegen_config_overrides 1/1 ... [2025-12-04 11:14:05.229584][3573353.754393552], took 0.16min 2025-12-04T11:14:05.2342726Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:14:05.2360656Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:14:05.2363132Z Running export/test_passes 1/1 ... [2025-12-04 11:14:05.236231][3573353.761044035] 2025-12-04T11:14:05.2363373Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:14:05.2366327Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_passes.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:14:05.236447] 2025-12-04T11:14:29.4559718Z 2025-12-04T11:14:29.4561042Z export/test_passes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_passes_1.1_9c4953c4adf3b7fb_.log 2025-12-04T11:14:29.4564202Z Running 28 items in this shard: test/export/test_passes.py::TestPasses::test_constant_folding_pass, test/export/test_passes.py::TestPasses::test_custom_obj_tuple_out, test/export/test_passes.py::TestPasses::test_fakify_script_objects, test/export/test_passes.py::TestPasses::test_fakify_script_objects_properly_handle_containers, test/export/test_passes.py::TestPasses::test_functionalization_with_view_copy, test/export/test_passes.py::TestPasses::test_inline_, test/export/test_passes.py::TestPasses::test_math_ops, test/export/test_passes.py::TestPasses::test_move_device_example_inputs, test/export/test_passes.py::TestPasses::test_move_device_submod, test/export/test_passes.py::TestPasses::test_move_device_to, test/export/test_passes.py::TestPasses::test_move_to_device_pass, test/export/test_passes.py::TestPasses::test_predispatch_autocast, test/export/test_passes.py::TestPasses::test_predispatch_autocast_and_set_grad, test/export/test_passes.py::TestPasses::test_predispatch_set_grad, test/export/test_passes.py::TestPasses::test_remove_auto_functionalized_pass, test/export/test_passes.py::TestPasses::test_remove_auto_functionalized_pass_tuple, test/export/test_passes.py::TestPasses::test_remove_effect_token_kwargs, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_cond, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_item, test/export/test_passes.py::TestPasses::test_runtime_assert_inline_constraints_for_nonzero, test/export/test_passes.py::TestPasses::test_runtime_assert_multiple_dims, test/export/test_passes.py::TestPasses::test_runtime_assert_one_dim, test/export/test_passes.py::TestPasses::test_runtime_assert_some_dims_not_specified, 
test/export/test_passes.py::TestPasses::test_runtime_assert_some_inps_not_used, test/export/test_passes.py::TestPasses::test_sequential_split, test/export/test_passes.py::TestPasses::test_sequential_split_graph, test/export/test_passes.py::TestPasses::test_view_to_view_copy, test/export/test_passes.py::TestPasses::test_views_op_having_view_copy 2025-12-04T11:14:29.4567403Z 2025-12-04T11:14:29.4567518Z Finished export/test_passes 1/1 ... [2025-12-04 11:14:29.450448][3573377.975257811], took 0.40min 2025-12-04T11:14:29.4567910Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:14:29.4568385Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:14:29.4572034Z Running dynamo/test_autograd_function 1/1 ... [2025-12-04 11:14:29.457000][3573377.981813376] 2025-12-04T11:14:29.4572304Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:14:29.4573863Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_autograd_function.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:14:29.457261] 2025-12-04T11:14:41.4651792Z 2025-12-04T11:14:41.4653560Z dynamo/test_autograd_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_autograd_function_1.1_b4e7456b121c8667_.log 2025-12-04T11:14:41.4660149Z Running 41 items in this shard: test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_allow_in_graph, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_amp_custom_fwd_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_assert_is_contiguous_after_matmul, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_assert_is_contiguous_on_grad_output_directly, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_autograd_function_equivalence, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_autograd_function_has_graph_break, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_backward_returns_none_for_tensor_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_classmethod, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_data_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_default_values, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_enum_arg, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_forward_returns_constant, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_context_mark_and_save, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_context_save_and_mark, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_function_with_bound_free_variable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_fwd_no_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_fwd_propogation_correctness, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_linear_setup_context, 
test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_mark_multi_output_non_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_mark_non_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_materialize_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_multi_output, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_multiple_different_non_tensor_inputs, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_needs_input_grad, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_nonlocal_list_mutation_in_autograd_function, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_once_differentiable, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_print_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_repeated_save_for_backward_calls, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_requires_grad_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_save_for_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_set_materialize_grads_no_graph_break, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smoke_from_test_autograd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smuggle_symint_issue_111031, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_smuggle_tensor_and_complex_structures, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_stride_in_bwd, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tensor_list_as_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tensor_subclass_intermediary_input, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_triton_kernel_basic, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_triton_kernel_multiple_out, 
test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_tuple_arg, test/dynamo/test_autograd_function.py::AutogradFunctionTests::test_user_defined_object_as_input 2025-12-04T11:14:41.4665524Z 2025-12-04T11:14:41.4665656Z Finished dynamo/test_autograd_function 1/1 ... [2025-12-04 11:14:41.464916][3573389.989727252], took 0.20min 2025-12-04T11:14:41.4666069Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:14:41.4722627Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:14:41.4723222Z Running inductor/test_codecache 1/1 ... [2025-12-04 11:14:41.471634][3573389.996446664] 2025-12-04T11:14:41.4723421Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:14:41.4723840Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_codecache.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:14:41.471854] 2025-12-04T11:18:13.8151143Z 2025-12-04T11:18:13.8152543Z inductor/test_codecache 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_codecache_1.1_ddc66e006789740b_.log 2025-12-04T11:18:13.8207497Z Running 247 items in this shard: test/inductor/test_codecache.py::TestPyCodeCache::test_editable_cached_wrapper, test/inductor/test_codecache.py::TestPyCodeCache::test_linemaps_empty, test/inductor/test_codecache.py::TestFxGraphCache::test_async_compile_cache, test/inductor/test_codecache.py::TestFxGraphCache::test_auto_functionalized_caching_variant_v1, test/inductor/test_codecache.py::TestFxGraphCache::test_auto_functionalized_caching_variant_v2, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_clear, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_guard, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_guard_overspec, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_False_device_cpu_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_False_device_cpu_float32, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_False_device_cuda_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_False_device_cuda_float32, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_True_device_cpu_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_True_device_cpu_float32, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_True_device_cuda_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_caching_precompile_dynamic_True_device_cuda_float32, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cpu_bfloat16_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cpu_bfloat16_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cpu_float32_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cpu_float32_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cuda_bfloat16_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cuda_bfloat16_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cuda_float32_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_device_cuda_float32_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_empty, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_generic, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_pgo, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_pgo_swap_file_names, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_hot_load_repeat, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_True, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_True, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_True, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True_grad_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cpu_float32_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cpu_float32_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cpu_float64_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cpu_float64_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cuda_float32_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cuda_float32_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cuda_float64_dynamic_False, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_model_device_cuda_float64_dynamic_True, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_int32_bounds_device_cuda_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_int32_bounds_device_cuda_float16, 
test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_static_bounds_device_cpu_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_static_bounds_device_cpu_float32, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_static_bounds_device_cuda_bfloat16, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_load_with_guards_static_bounds_device_cuda_float32, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_with_nt, test/inductor/test_codecache.py::TestFxGraphCache::test_cache_with_symint_non_arg_guard, test/inductor/test_codecache.py::TestFxGraphCache::test_constant_handling_device_cpu, test/inductor/test_codecache.py::TestFxGraphCache::test_constant_handling_device_cuda, test/inductor/test_codecache.py::TestFxGraphCache::test_flex_attention_caching, test/inductor/test_codecache.py::TestFxGraphCache::test_freezing_device_cpu_inlinable_False, test/inductor/test_codecache.py::TestFxGraphCache::test_freezing_device_cpu_inlinable_True, test/inductor/test_codecache.py::TestFxGraphCache::test_freezing_device_cuda_inlinable_False, test/inductor/test_codecache.py::TestFxGraphCache::test_freezing_device_cuda_inlinable_True, test/inductor/test_codecache.py::TestFxGraphCache::test_generated_kernel_count, test/inductor/test_codecache.py::TestFxGraphCache::test_higher_order_op_bypass_bundle_triton_False, test/inductor/test_codecache.py::TestFxGraphCache::test_higher_order_op_bypass_bundle_triton_True, test/inductor/test_codecache.py::TestFxGraphCache::test_inductor_counters, test/inductor/test_codecache.py::TestFxGraphCache::test_no_arguments_tensor_device_guards, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True, 
test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cpu_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_bfloat16_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_remote_cache_load_function_device_cuda_float32_dynamic_True_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_tensor_device_guards_cpu_tensor, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_higher_order_op_bundle_triton_False, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_higher_order_op_bundle_triton_True, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_higher_order_op_different_configs_bundle_triton_False, 
test/inductor/test_codecache.py::TestFxGraphCache::test_triton_higher_order_op_different_configs_bundle_triton_True, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_op_bundle_triton_False_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_op_bundle_triton_False_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_op_bundle_triton_True_use_static_cuda_launcher_False, test/inductor/test_codecache.py::TestFxGraphCache::test_triton_op_bundle_triton_True_use_static_cuda_launcher_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_dynamic_shapes_from_example_inputs_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_dynamic_shapes_from_example_inputs_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_is_aot_False_dynamic_shapes_from_graph, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_is_aot_False_dynamic_shapes_from_tracing_context, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_is_aot_True_dynamic_shapes_from_graph, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_is_aot_True_dynamic_shapes_from_tracing_context, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_static_shapes_dynamic_shapes_from_example_inputs, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_static_shapes_dynamic_shapes_from_graph, test/inductor/test_codecache.py::TestStandaloneCompile::test_backend_static_shapes_dynamic_shapes_from_tracing_context, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_False_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_False_graph_partition_False_is_aot_True, 
test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_False_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_False_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_True_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_True_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_True_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_binary_dynamic_True_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_False_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_False_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_False_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_False_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_True_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_True_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_True_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cpu_format_unpacked_dynamic_True_graph_partition_True_is_aot_True, 
test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_False_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_False_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_False_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_False_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_True_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_True_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_True_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_binary_dynamic_True_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_False_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_False_graph_partition_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_False_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_False_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_True_graph_partition_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_True_graph_partition_False_is_aot_True, 
test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_True_graph_partition_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_basic_device_cuda_format_unpacked_dynamic_True_graph_partition_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_call_in_backend_dynamic_False_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_call_in_backend_dynamic_False_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_call_in_backend_dynamic_True_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_call_in_backend_dynamic_True_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_custom_pass_handling, test/inductor/test_codecache.py::TestStandaloneCompile::test_different_process, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_example_inputs_is_aot_False_config_patches_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_example_inputs_is_aot_False_config_patches_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_example_inputs_is_aot_True_config_patches_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_example_inputs_is_aot_True_config_patches_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_graph_is_aot_False, test/inductor/test_codecache.py::TestStandaloneCompile::test_dynamic_shapes_from_graph_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_modify_unpacked_file_device_cpu, test/inductor/test_codecache.py::TestStandaloneCompile::test_modify_unpacked_file_device_cuda, test/inductor/test_codecache.py::TestStandaloneCompile::test_save_in_new_path, test/inductor/test_codecache.py::TestStandaloneCompile::test_split_module_is_aot_False, 
test/inductor/test_codecache.py::TestStandaloneCompile::test_split_module_is_aot_True, test/inductor/test_codecache.py::TestStandaloneCompile::test_static_shapes_is_aot_False_dynamic_shapes_from_example_inputs, test/inductor/test_codecache.py::TestStandaloneCompile::test_static_shapes_is_aot_False_dynamic_shapes_from_graph, test/inductor/test_codecache.py::TestStandaloneCompile::test_static_shapes_is_aot_True_dynamic_shapes_from_example_inputs, test/inductor/test_codecache.py::TestStandaloneCompile::test_static_shapes_is_aot_True_dynamic_shapes_from_graph, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_bypass_unsupported, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_get_hash_for_files, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_config_changes, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_custom_backend_config, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_custom_backend_pass, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_custom_partitioner_fn, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_custom_passes, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_fake_tensors, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_kwargs, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_hash_private_config_changes, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_non_serializable_custom_passes_causes_cache_miss, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_parameter_constants, test/inductor/test_codecache.py::TestFxGraphCacheHashing::test_stable_strings, test/inductor/test_codecache.py::TestCudaCompileCommand::test_cuda_compile_command, test/inductor/test_codecache.py::TestAutotuneCache::test_autotune_cache, test/inductor/test_codecache.py::TestAutotuneCache::test_autotune_cache_warm_start, 
test/inductor/test_codecache.py::TestAutotuneCache::test_bundled_autotune_remote_cache, test/inductor/test_codecache.py::TestAutotuneCache::test_modified_autotune_cache_remote_cache_False, test/inductor/test_codecache.py::TestAutotuneCache::test_modified_autotune_cache_remote_cache_True, test/inductor/test_codecache.py::TestRemoteAOTAutogradCache::test_autograd_remote_cache, test/inductor/test_codecache.py::TestRemoteAOTAutogradCache::test_autograd_remote_lazy_backward, test/inductor/test_codecache.py::TestUtils::test_force_disable_coordinate_descent, test/inductor/test_codecache.py::TestUtils::test_fresh_cache 2025-12-04T11:18:13.8249168Z 2025-12-04T11:18:13.8249286Z Finished inductor/test_codecache 1/1 ... [2025-12-04 11:18:13.815064][3573602.33987199], took 3.54min 2025-12-04T11:18:13.8249679Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:18:13.8250035Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:18:13.8250271Z Running inductor/test_distributed_patterns 1/1 ... [2025-12-04 11:18:13.822646][3573602.347459816] 2025-12-04T11:18:13.8250471Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:18:13.8250871Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_distributed_patterns.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:18:13.822865] 2025-12-04T11:18:34.0701981Z 2025-12-04T11:18:34.0703150Z inductor/test_distributed_patterns 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_distributed_patterns_1.1_b1e99b9ad9abc509_.log 2025-12-04T11:18:34.0709503Z Running 20 items in this shard: test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_aot_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_fake_distributed_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_intermediate_hook_with_nested_closure, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_aot, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_eager, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_inductor, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_module_backward_hooks_multi_layers, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return3, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return4, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_nonzero_gpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_cpu, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_storage_resize_zero_gpu, 
test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_preserve_version_counter2, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter1, test/inductor/test_distributed_patterns.py::DistributedPatternTests::test_unsafe_set_version_counter2 2025-12-04T11:18:34.0714133Z 2025-12-04T11:18:34.0714450Z Finished inductor/test_distributed_patterns 1/1 ... [2025-12-04 11:18:34.069851][3573622.594659129], took 0.34min 2025-12-04T11:18:34.0716754Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:18:34.0772500Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:18:34.0775179Z Running dynamo/test_fake_distributed 1/1 ... [2025-12-04 11:18:34.077289][3573622.602102217] 2025-12-04T11:18:34.0775425Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:18:34.0776366Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_fake_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:18:34.077514] 2025-12-04T11:18:37.0464707Z 2025-12-04T11:18:37.0465968Z dynamo/test_fake_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_fake_distributed_1.1_18a0648163316331_.log 2025-12-04T11:18:37.0467793Z Running 3 items in this shard: test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_all_to_all_single_autograd, test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_device_mesh_flatten, test/dynamo/test_fake_distributed.py::TestFakeDistributed::test_device_mesh_get_local_rank 2025-12-04T11:18:37.0468918Z 2025-12-04T11:18:37.0469180Z Finished dynamo/test_fake_distributed 1/1 ... [2025-12-04 11:18:37.046088][3573625.570897282], took 0.05min 2025-12-04T11:18:37.0479527Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:18:37.0538449Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:18:37.0540389Z Running export/test_nativert 1/1 ... [2025-12-04 11:18:37.053867][3573625.578680465] 2025-12-04T11:18:37.0540737Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:18:37.0542306Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_nativert.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:18:37.054095] 2025-12-04T11:18:42.7277571Z 2025-12-04T11:18:42.7278730Z export/test_nativert 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_nativert_1.1_07bd76b06db0d761_.log 2025-12-04T11:18:42.7280817Z Running 6 items in this shard: test/export/test_nativert.py::TestNativeRT::test_aoti_0_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_1_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_2_cpu, test/export/test_nativert.py::TestNativeRT::test_aoti_3_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_4_cuda, test/export/test_nativert.py::TestNativeRT::test_aoti_5_cuda 2025-12-04T11:18:42.7282946Z 2025-12-04T11:18:42.7283267Z Finished export/test_nativert 1/1 ... [2025-12-04 11:18:42.727405][3573631.252213806], took 0.09min 2025-12-04T11:18:42.7292454Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:18:42.7351307Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:18:42.7353428Z Running inductor/test_custom_op_autotune 1/1 ... [2025-12-04 11:18:42.735161][3573631.259974239] 2025-12-04T11:18:42.7353801Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:18:42.7355302Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'inductor/test_custom_op_autotune.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:18:42.735391] 2025-12-04T11:18:58.9259341Z 2025-12-04T11:18:58.9260885Z inductor/test_custom_op_autotune 1/1 was successful, full logs can be found in artifacts with path test/test-reports/inductor.test_custom_op_autotune_1.1_01a3575f8e373e46_.log 2025-12-04T11:18:58.9262366Z Running 3 items in this shard: test/inductor/test_custom_op_autotune.py::TestCustomOpAutoTune::test_decompose_k_custom_op_autotune_dynamic_config_for_input_shape, test/inductor/test_custom_op_autotune.py::TestCustomOpAutoTune::test_multi_parameter_tuning, test/inductor/test_custom_op_autotune.py::TestCustomOpAutoTune::test_rmsnorm_custom_op_autotune_with_dynamic_shape 2025-12-04T11:18:58.9263273Z 2025-12-04T11:18:58.9263492Z Finished inductor/test_custom_op_autotune 1/1 ... [2025-12-04 11:18:58.925666][3573647.450473432], took 0.27min 2025-12-04T11:18:58.9275764Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:18:58.9334724Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:18:58.9336694Z Running export/test_converter 1/1 ... [2025-12-04 11:18:58.933522][3573647.458335593] 2025-12-04T11:18:58.9336980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:18:58.9338572Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_converter.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:18:58.933738] 2025-12-04T11:19:08.6701409Z 2025-12-04T11:19:08.6703644Z export/test_converter 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_converter_1.1_f69874feb5bb46bc_.log 2025-12-04T11:19:08.6710623Z Running 48 items in this shard: test/export/test_converter.py::TestConverter::test_aten___getitem___dict, test/export/test_converter.py::TestConverter::test_aten___getitem___list, test/export/test_converter.py::TestConverter::test_aten___is__, test/export/test_converter.py::TestConverter::test_aten___isnot__, test/export/test_converter.py::TestConverter::test_aten___not__, test/export/test_converter.py::TestConverter::test_aten_add_t, test/export/test_converter.py::TestConverter::test_aten_append_t, test/export/test_converter.py::TestConverter::test_aten_dim, test/export/test_converter.py::TestConverter::test_aten_floordiv, test/export/test_converter.py::TestConverter::test_aten_len, test/export/test_converter.py::TestConverter::test_aten_tensor_dtype_int, test/export/test_converter.py::TestConverter::test_aten_tensor_dynamic, test/export/test_converter.py::TestConverter::test_aten_tensor_prim_dtype, test/export/test_converter.py::TestConverter::test_aten_to_dtype_with_mutating_storage, test/export/test_converter.py::TestConverter::test_context_manager, test/export/test_converter.py::TestConverter::test_convert_func_without_param, test/export/test_converter.py::TestConverter::test_convert_if_basic, test/export/test_converter.py::TestConverter::test_convert_if_duplicate_attr_names, test/export/test_converter.py::TestConverter::test_convert_if_multiple_out, test/export/test_converter.py::TestConverter::test_convert_if_tuple_out, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_buffer, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_if_and_buffer, 
test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_if_and_param, test/export/test_converter.py::TestConverter::test_convert_nn_module_with_nested_param, test/export/test_converter.py::TestConverter::test_convert_retrace_nested_scripted_modules, test/export/test_converter.py::TestConverter::test_convert_script_object, test/export/test_converter.py::TestConverter::test_get_tensor_constants, test/export/test_converter.py::TestConverter::test_hidden_input_name, test/export/test_converter.py::TestConverter::test_implicit_constant_to_tensor_handling, test/export/test_converter.py::TestConverter::test_prim_SetAttr, test/export/test_converter.py::TestConverter::test_prim_device, test/export/test_converter.py::TestConverter::test_prim_device_cuda, test/export/test_converter.py::TestConverter::test_prim_dtype, test/export/test_converter.py::TestConverter::test_prim_max, test/export/test_converter.py::TestConverter::test_prim_min, test/export/test_converter.py::TestConverter::test_prim_tolist, test/export/test_converter.py::TestConverter::test_profiler__record_function, test/export/test_converter.py::TestConverter::test_raise_exception, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model1, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model_with_opcontext, test/export/test_converter.py::TestConverter::test_ts2ep_convert_quantized_model_with_opcontext_and_constant, test/export/test_converter.py::TestConverter::test_ts2ep_converter_basic, test/export/test_converter.py::TestConverter::test_ts2ep_converter_container_output, test/export/test_converter.py::TestConverter::test_ts2ep_converter_contains, test/export/test_converter.py::TestConverter::test_ts2ep_converter_custom_op, test/export/test_converter.py::TestConverter::test_ts2ep_converter_unpack, test/export/test_converter.py::TestConverter::test_ts2ep_multi_outputs_on_call_ops, 
test/export/test_converter.py::TestConverter::test_ts2ep_with_loop 2025-12-04T11:19:08.6715509Z 2025-12-04T11:19:08.6715623Z Finished export/test_converter 1/1 ... [2025-12-04 11:19:08.669794][3573657.194604622], took 0.16min 2025-12-04T11:19:08.6716016Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:19:08.6769315Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:19:08.6771374Z Running dynamo/test_reorder_logs 1/1 ... [2025-12-04 11:19:08.676980][3573657.201793595] 2025-12-04T11:19:08.6771580Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:19:08.6773278Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_reorder_logs.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:19:08.677207] 2025-12-04T11:19:11.3965124Z 2025-12-04T11:19:11.3966930Z dynamo/test_reorder_logs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_reorder_logs_1.1_8aa5278557ea40b9_.log 2025-12-04T11:19:11.3972558Z Running 14 items in this shard: test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method0_fn0_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method1_fn1_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method2_fn2_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method3_fn3_should_ignore_logger_False, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method4_fn4_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method5_fn5_should_ignore_logger_True, 
test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method6_fn6_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::IgnoreLogsTests::test_ignore_logger_ignore_method7_fn7_should_ignore_logger_True, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_constant_mutation, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_dont_reorder_print, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_custom_log_fn, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_print_graph_break, test/dynamo/test_reorder_logs.py::ReorderLogsTests::test_reorder_warnings 2025-12-04T11:19:11.3976998Z 2025-12-04T11:19:11.3977200Z Finished dynamo/test_reorder_logs 1/1 ... [2025-12-04 11:19:11.396229][3573659.921035665], took 0.05min 2025-12-04T11:19:11.3982034Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:19:11.4040502Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:19:11.4042532Z Running dynamo/test_subclasses 1/1 ... [2025-12-04 11:19:11.404101][3573659.928914717] 2025-12-04T11:19:11.4042797Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:19:11.4044575Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'dynamo/test_subclasses.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:19:11.404327] 2025-12-04T11:19:32.7025575Z 2025-12-04T11:19:32.7026764Z dynamo/test_subclasses 1/1 was successful, full logs can be found in artifacts with path test/test-reports/dynamo.test_subclasses_1.1_8742bf5ea184a1ab_.log 2025-12-04T11:19:32.7050727Z Running 126 items in this shard: test/dynamo/test_subclasses.py::SubclassTests::test_as_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_base_torch_function_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_compile_higher_order_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_automatic_dynamic, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_fake_tensor_dynamic_dim, test/dynamo/test_subclasses.py::SubclassTests::test_compile_with_functionalization, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values, test/dynamo/test_subclasses.py::SubclassTests::test_disable_all_torch_function_restore_values_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_has_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_make_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_mark_static_with_subclass_desugaring_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_newly_constructed_tensor_subclass_attr_mutation, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_buffer, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_from_cat, test/dynamo/test_subclasses.py::SubclassTests::test_njt_subclass_simple, test/dynamo/test_subclasses.py::SubclassTests::test_no_call_to_new, test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_on_size_bytecode, 
test/dynamo/test_subclasses.py::SubclassTests::test_no_torch_function_recompiles, test/dynamo/test_subclasses.py::SubclassTests::test_nontraceable_tensor_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_overridden_method_guarding, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_custom_torch_func_and_dynamic_attr, test/dynamo/test_subclasses.py::SubclassTests::test_parameter_subclass_with_old_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_recompile_with_symbool_inputs, test/dynamo/test_subclasses.py::SubclassTests::test_recompiles_with_optional_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_return_as_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_local_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_return_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_TwoTensor_TwoTensor, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_TwoTensor_nested_diff_sizes, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_constructor_proxying, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_attr, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_dont_invoke_torch_function_on_overridden_method, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_override_shape_and_to, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_parameters_are_static_under_training, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_False, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_views_dynamic_True, test/dynamo/test_subclasses.py::SubclassTests::test_subclass_with_disabled_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_support_bases, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_automatic_dynamic_shapes, 
test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_clone_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_different_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mark_dynamic_shapes, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_nested, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_multiple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_shape, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_return_tensor_and_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_simple, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_TwoTensor_view_mul, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_attr_codegen_tos, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_arg_num, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_error_not_classmethod, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_custom_guards_override, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_ctx_recursive_guards, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_custom_attr, test/dynamo/test_subclasses.py::SubclassTests::test_tensor_subclass_with_non_classmethod_torch_function, test/dynamo/test_subclasses.py::SubclassTests::test_torch_dispatch_subclass_guard_recompile, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_attr, 
test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_call_on_method_arg, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_list_args, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_graph_break, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_guards, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_nested, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_state_tracing, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_subclass_survives_into_aot_autograd, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class, test/dynamo/test_subclasses.py::SubclassTests::test_torch_function_wrapper_class_with_kwargs, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_equality_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_identity_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_subclass, test/dynamo/test_subclasses.py::SubclassTests::test_type_check_isinstance_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_attr_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_method_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_user_overridden_property_unsupported, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_dynamo_attribute_access_on_intermediate, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_guards_on_inner_tensor, test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_differently_sized_inner_tensor, 
test/dynamo/test_subclasses.py::SubclassTests::test_wrapper_subclass_with_same_sized_inner_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd, test/dynamo/test_subclasses.py::TestNestedTensor::test_basic_autograd_inductor, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_binary_recompiles, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_input_6, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_4, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_from_intermediate_5, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_2, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_construction_mixed_3, test/dynamo/test_subclasses.py::TestNestedTensor::test_in_graph_is_nested_call, test/dynamo/test_subclasses.py::TestNestedTensor::test_inference_tensor, test/dynamo/test_subclasses.py::TestNestedTensor::test_inline_nested_tensor_from_jagged, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_basic, 
test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_False_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_basic, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_False_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_False, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_leaf_True_True, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_base_is_nt_True_obscure, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_dense_subclass_dense_subclass, test/dynamo/test_subclasses.py::TestNestedTensor::test_inputs_to_compiled_fn_are_views_nt_view_name_subclass_dense, test/dynamo/test_subclasses.py::TestNestedTensor::test_param_subclass_isinstance_input, test/dynamo/test_subclasses.py::TestNestedTensor::test_return_shape, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_dense_subclass_dense_view, 
test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_gives_static_shapes_when_dynamic_false, test/dynamo/test_subclasses.py::TestNestedTensor::test_subclass_with_mutation_in_graph, test/dynamo/test_subclasses.py::TestNestedTensor::test_unary_does_not_recompile, test/dynamo/test_subclasses.py::TestNestedTensor::test_unbind 2025-12-04T11:19:32.7066074Z 2025-12-04T11:19:32.7066189Z Finished dynamo/test_subclasses 1/1 ... [2025-12-04 11:19:32.702493][3573681.227299962], took 0.35min 2025-12-04T11:19:32.7066575Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:19:32.7101332Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:19:32.7103917Z Running export/test_verifier 1/1 ... [2025-12-04 11:19:32.710179][3573681.234992537] 2025-12-04T11:19:32.7104319Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:19:32.7105281Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_verifier.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:19:32.710399] 2025-12-04T11:19:35.7304288Z 2025-12-04T11:19:35.7305067Z export/test_verifier 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_verifier_1.1_f5e9e0ad9fb99963_.log 2025-12-04T11:19:35.7306906Z Running 10 items in this shard: test/export/test_verifier.py::TestVerifier::test_ep_verifier_basic, test/export/test_verifier.py::TestVerifier::test_ep_verifier_buffer_mutate, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_buffer, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_output, test/export/test_verifier.py::TestVerifier::test_ep_verifier_invalid_param, test/export/test_verifier.py::TestVerifier::test_verifier_basic, test/export/test_verifier.py::TestVerifier::test_verifier_call_module, test/export/test_verifier.py::TestVerifier::test_verifier_higher_order, test/export/test_verifier.py::TestVerifier::test_verifier_nested_invalid_module, test/export/test_verifier.py::TestVerifier::test_verifier_no_functional 2025-12-04T11:19:35.7307961Z 2025-12-04T11:19:35.7308080Z Finished export/test_verifier 1/1 ... [2025-12-04 11:19:35.730123][3573684.254932323], took 0.05min 2025-12-04T11:19:35.7320516Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:19:35.7378982Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:19:35.7380726Z Running export/test_sparse 1/1 ... [2025-12-04 11:19:35.737930][3573684.262743725] 2025-12-04T11:19:35.7381109Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:19:35.7383003Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'export/test_sparse.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:19:35.738155] 2025-12-04T11:25:53.1528687Z 2025-12-04T11:25:53.1530022Z export/test_sparse 1/1 was successful, full logs can be found in artifacts with path test/test-reports/export.test_sparse_1.1_3f62cc8a4b9b0d50_.log 2025-12-04T11:25:53.1562756Z Running 203 items in this shard: test/export/test_sparse.py::TestSparseProp::test_activation_coo, test/export/test_sparse.py::TestSparseProp::test_activation_csr, test/export/test_sparse.py::TestSparseProp::test_add, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseBSR, 
test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_float64_int64_SparseCSR, 
test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_eltwisenet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSC, 
test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSC, 
test/export/test_sparse.py::TestSparseProp::test_idnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_idnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCOO, 
test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseBSR, 
test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_sumnet_int64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_bfloat16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSC, 
test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float16_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float32_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSC, 
test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_float64_int64_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int32_SparseCSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseBSR, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCOO, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSC, test/export/test_sparse.py::TestSparseProp::test_todensenet_int64_int64_SparseCSR
2025-12-04T11:25:53.1584566Z
2025-12-04T11:25:53.1584678Z Finished export/test_sparse 1/1 ... [2025-12-04 11:25:53.152744][3574061.677552777], took 6.29min
2025-12-04T11:25:53.1585158Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:25:53.1598973Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:25:53.1602483Z Running test_weak 1/1 ... 
[2025-12-04 11:25:53.159974][3574061.684787947]
2025-12-04T11:25:53.1602686Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:25:53.1603088Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_weak.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:25:53.160161]
2025-12-04T11:25:59.4340640Z
2025-12-04T11:25:59.4341519Z test_weak 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_weak_1.1_87ae930dce61278e_.log
2025-12-04T11:25:59.4349100Z Running 39 items in this shard: test/test_weak.py::WeakTest::test_make_weak_keyed_dict_from_dict, test/test_weak.py::WeakTest::test_make_weak_keyed_dict_from_weak_keyed_dict, test/test_weak.py::WeakTest::test_make_weak_keyed_dict_repr, test/test_weak.py::WeakTest::test_threaded_weak_key_dict_copy, test/test_weak.py::WeakTest::test_threaded_weak_key_dict_deepcopy, test/test_weak.py::WeakTest::test_weak_keyed_bad_delitem, test/test_weak.py::WeakTest::test_weak_keyed_delitem, test/test_weak.py::WeakTest::test_weak_keyed_dict_popitem, test/test_weak.py::WeakTest::test_weak_keyed_dict_setdefault, test/test_weak.py::WeakTest::test_weak_keyed_dict_update, test/test_weak.py::WeakTest::test_weak_keyed_union_operators, test/test_weak.py::WeakKeyDictionaryTestCase::test_bool, test/test_weak.py::WeakKeyDictionaryTestCase::test_constructor, test/test_weak.py::WeakKeyDictionaryTestCase::test_get, test/test_weak.py::WeakKeyDictionaryTestCase::test_getitem, test/test_weak.py::WeakKeyDictionaryTestCase::test_items, test/test_weak.py::WeakKeyDictionaryTestCase::test_keys, test/test_weak.py::WeakKeyDictionaryTestCase::test_len, test/test_weak.py::WeakKeyDictionaryTestCase::test_pop, test/test_weak.py::WeakKeyDictionaryTestCase::test_popitem, test/test_weak.py::WeakKeyDictionaryTestCase::test_read, test/test_weak.py::WeakKeyDictionaryTestCase::test_setdefault, 
test/test_weak.py::WeakKeyDictionaryTestCase::test_update, test/test_weak.py::WeakKeyDictionaryTestCase::test_values, test/test_weak.py::WeakKeyDictionaryTestCase::test_write, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_bool, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_constructor, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_get, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_getitem, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_items, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_keys, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_len, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_pop, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_popitem, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_read, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_setdefault, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_update, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_values, test/test_weak.py::WeakKeyDictionaryScriptObjectTestCase::test_write
2025-12-04T11:25:59.4354987Z
2025-12-04T11:25:59.4355136Z Finished test_weak 1/1 ... [2025-12-04 11:25:59.433666][3574067.958475943], took 0.10min
2025-12-04T11:25:59.4355736Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:25:59.4405456Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:25:59.4409408Z Running test_decomp 2/12 ... 
[2025-12-04 11:25:59.440573][3574067.965385938]
2025-12-04T11:25:59.4409816Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:25:59.4410279Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '--shard-id=2', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:25:59.440791]
2025-12-04T11:37:41.5153232Z
2025-12-04T11:37:41.5153800Z test_decomp 2/12 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_2.12_7cc02c307311b1cc_.log
2025-12-04T11:37:41.5238714Z Running 772 items in this shard: test/test_decomp.py::TestDecompCUDA::test_arange_graph_cuda, test/test_decomp.py::TestDecompCUDA::test_bernoulli_default_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_H_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_T_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___radd___cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmatmul___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rxor___cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcdiv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_decomposed_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmv_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_all_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_allclose_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_amin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_aminmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argsort_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_partial_views_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_2d_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_baddbmm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bfloat16_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_or_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cfloat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_solve_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clamp_min_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_column_stack_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumprod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_embed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diff_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_floor_rounding_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_permuted_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfinv_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float8_e4m3fn, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftshift_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft2_cuda_int16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfft_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_rfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gradient_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hsplit_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_i0_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_int_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isfinite_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isinf_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isneginf_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ldexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cond_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_diagonal_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_ldl_factor_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_solve_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_norm_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_hermitian_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_pinv_singular_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_qr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_svd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linspace_tensor_overload_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_normal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_with_dtype_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logical_or_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logit_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mT_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_prod_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_softmin_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_sum_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_var_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matmul_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_binary_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_no_dim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_reduction_with_dim_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_msort_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanquantile_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_neg_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool1d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_channel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose2d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv_transpose3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_similarity_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_ctc_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool2d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_fractional_max_pool3d_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_group_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_area_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_layer_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_leaky_relu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_margin_ranking_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool1d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool1d_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_circular_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_unshuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_prelu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_relu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_selu_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_silu_complex_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_smooth_l1_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softmin_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softplus_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_upsample_nearest_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_static_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_in_place_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ones_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_complex64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_polar_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_3_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_put_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_qr_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rad2deg_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randint_like_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ravel_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_remainder_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_conj_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_roll_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rot90_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amax_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sgn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_bartlett_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_signal_windows_general_hamming_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_signbit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_slice_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sparse_sampled_addmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_y0_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_u_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_w_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_erfcx_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i1e_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_legendre_polynomial_p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtri_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_list_args_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_sub_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tril_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_indices_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trunc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unflatten_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unfold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_consecutive_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_chunk_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsafe_split_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_unbiased_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_complex32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_vstack_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_xlogy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick__upsample_bilinear2d_aa_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_asin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_and_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_or_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cat_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_ceil_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_clone_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_constant_pad_nd_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_dot_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logsumexp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_squeeze_multiple_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_tril_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_deg2rad_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_dist_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float8_e4m3fnuz, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_float8_e5m2fnuz, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifft_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfftn_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_floor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fmin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_frac_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_hypot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_index_add_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_isnan_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isposinf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_quick_le_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_log_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logaddexp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_and_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_logical_not_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logit_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nan_to_num_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_native_dropout_backward_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_native_layer_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_new_ones_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_new_zeros_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_elu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_embedding_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardsigmoid_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardswish_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_hardtanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool2d_grad_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_max_unpool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_pad_constant_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_prelu_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu6_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_norm_inf_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_nuc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_normal_number_mean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_prod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_rot90_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_round_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_rsqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_signbit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_softmax_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_erfcx_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i0e_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtr_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_xlog1py_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_special_zeta_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_split_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_sqrt_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_std_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_std_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_sub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_tanh_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unfold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int32, 
test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_var_mean_unbiased_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_vdot_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_uniform_cuda, test/test_decomp.py::DecompOneOffTestsCUDA::test_native_layer_norm_cpu_decomp_cuda, test/test_decomp.py::HasDecompTest::test_has_decomposition 2025-12-04T11:37:41.5316142Z 2025-12-04T11:37:41.5316253Z Finished test_decomp 2/12 ... [2025-12-04 11:37:41.515713][3574770.040523639], took 11.70min 2025-12-04T11:37:41.5316635Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:37:41.5316988Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:37:41.5317204Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:37:41.5317381Z Uploading artifacts took 0.00 seconds 2025-12-04T11:37:41.5317543Z Running test_decomp 8/12 ... 
[2025-12-04 11:37:41.522331][3574770.047145188] 2025-12-04T11:37:41.5317709Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:37:41.5318072Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_decomp.py', '--shard-id=8', '--num-shards=12', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:37:41.522518] 2025-12-04T11:46:09.4948687Z 2025-12-04T11:46:09.4949387Z test_decomp 8/12 was successful, full logs can be found in artifacts with path test/test-reports/test_decomp_8.12_6133fdee767f2c6a_.log 2025-12-04T11:46:09.5023368Z Running 709 items in this shard: test/test_decomp.py::TestDecompCUDA::test_broadcasting_index_copy_cuda, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive___getitem___cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmod___cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rmul___cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive___ror___cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive___rpow___cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive__batch_norm_with_update_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__chunk_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__softmax_backward_data_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive__unsafe_masked_index_put_accumulate_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_abs_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_acosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_add_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addcmul_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_addr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_alias_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_angle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_any_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_arange_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argmax_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_argwhere_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_as_strided_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_asinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atan_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atanh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_1d_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_atleast_3d_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_right_shift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bitwise_xor_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_block_diag_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bmm_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_bool_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_tensors_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_broadcast_to_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_bucketize_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_byte_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cartesian_prod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cauchy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cdouble_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ceil_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chalf_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_char_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cholesky_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_chunk_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_clone_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_combinations_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_conj_physical_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_constant_pad_nd_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_contiguous_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_corrcoef_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cos_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cosh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_count_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cross_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cummin_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_cumulative_trapezoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diag_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagflat_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_diagonal_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dist_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_div_no_rounding_mode_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_dsplit_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_dstack_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_einsum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_empty_strided_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eq_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_erfc_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_exp_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_as_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expand_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_expm1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_float8_e5m2fnuz, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_eye_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft2_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_fftn_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_hfft_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifft_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ifftn_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfft2_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_ihfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fft_irfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flatten_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fliplr_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_flipud_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_float_power_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_floor_divide_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmin_cuda_int64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_fmod_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_full_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gather_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gcd_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ge_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_geqrf_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_grid_sampler_3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_gt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_half_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hash_tensor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_heaviside_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_histc_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_hstack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_igamma_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_add_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_put_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_index_reduce_amin_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_isclose_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isnan_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isposinf_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_isreal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_item_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_4inputs_with_extra_args_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kron_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_kthvalue_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_le_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lerp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_cross_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_eigvalsh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lstsq_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_lu_factor_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_power_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_matrix_rank_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_multi_dot_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_norm_subgradients_at_zero_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_slogdet_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_ex_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_solve_triangular_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_linalg_vander_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log10_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log1p_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_log_softmax_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logaddexp_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logcumsumexp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_logspace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_long_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_lu_unpack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_amax_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmax_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_argmin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumprod_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_cumsum_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_fill_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_logsumexp_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_scatter_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_select_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_masked_std_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_matrix_exp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_no_dim_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_max_reduction_with_dim_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_maximum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mean_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_median_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_list_of_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_meshgrid_variadic_tensors_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_min_binary_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_minimum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_movedim_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mul_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmean_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nanmedian_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_narrow_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_native_dropout_backward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ne_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_new_zeros_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_alpha_dropout_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_avg_pool2d_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_bilinear_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_binary_cross_entropy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_conv2d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout3d_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_dropout_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_elu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_bag_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_embedding_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_gelu_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardshrink_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardswish_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_hardtanh_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_nearest_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_interpolate_trilinear_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_l1_loss_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_pool3d_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_max_unpool3d_grad_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_normalize_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_constant_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_reflect_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pad_replicate_negative_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pairwise_distance_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_pixel_shuffle_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_poisson_nll_loss_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_soft_margin_loss_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_softsign_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_threshold_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_loss_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_nonzero_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_norm_fro_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_normal_number_mean_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_ormqr_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_outer_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_bool, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_permute_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pinverse_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_0_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_polygamma_polygamma_n_4_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_positive_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_pow_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rand_like_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_randn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_real_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reciprocal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_repeat_interleave_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_reshape_as_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize__cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resize_as__cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_resolve_neg_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_round_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_rsub_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scalar_tensor_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_add_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_mean_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_prod_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_scatter_reduce_sum_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_searchsorted_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_select_scatter_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_short_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sigmoid_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sign_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinc_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sinh_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_softmax_with_dtype_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sort_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_bessel_j0_cuda_int8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_t_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_chebyshev_polynomial_v_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_i0e_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_log_ndtr_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i0_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_i1_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_ndtr_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k0_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_scaled_modified_bessel_k1_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_spherical_bessel_j0_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_special_xlog1py_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_split_with_sizes_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sqrt_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_square_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_squeeze_multiple_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_stack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_std_mean_unbiased_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_sum_to_size_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_t_copy_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_take_along_dim_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tanh_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_tile_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_to_sparse_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_topk_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch__scaled_mm_v2_cuda_float8_e4m3fn, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trace_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_transpose_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapezoid_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_trapz_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_triu_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_true_divide_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unbind_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_uniform_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unique_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unravel_index_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_unsqueeze_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_var_mean_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_copy_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_comprehensive_view_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_comprehensive_vsplit_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_where_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zero__cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_comprehensive_zeros_like_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick__chunk_cat_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick__native_batch_norm_legit_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick__unsafe_masked_index_put_accumulate_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_abs_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_acos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_acosh_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_add_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_addcmul_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_addmm_decomposed_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_addmv_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_all_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_amax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_amin_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_any_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_atan2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_atan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_atanh_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_baddbmm_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_bitwise_left_shift_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_block_diag_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_bucketize_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cauchy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_max_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_clamp_min_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_complex_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_conj_physical_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_copysign_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_frac_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_logit_cuda_float64, 
test/test_decomp.py::TestDecompCUDA::test_quick_core_backward_unbind_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_cos_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_cosh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_count_nonzero_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_cumsum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diag_embed_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_diagonal_scatter_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_digamma_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_div_no_rounding_mode_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_empty_like_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_empty_strided_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_erf_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_erfc_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_erfinv_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_exp2_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_expand_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_eye_cuda_complex128, 
test/test_decomp.py::TestDecompCUDA::test_quick_fft_fft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_fftn_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft2_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfft_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_hfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ifftn_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfft_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_ihfftn_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft2_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_fft_irfft_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfft_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_fft_rfftn_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_flip_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_floor_divide_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_fmax_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_fmod_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_gcd_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_ge_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_gt_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_heaviside_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_igammac_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_index_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_index_fill_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_index_select_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_isin_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_isinf_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_item_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_lcm_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_lerp_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_lgamma_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_cross_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_linalg_diagonal_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_linspace_tensor_overload_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_log10_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_log1p_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_log2_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_logical_or_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_logical_xor_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_logspace_tensor_overload_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_logsumexp_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_lt_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_masked_fill_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_maximum_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_list_of_tensors_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_meshgrid_variadic_tensors_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_minimum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_mul_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_nansum_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_narrow_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_native_batch_norm_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_ne_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_neg_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_new_empty_strided_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_nextafter_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_gelu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_gelu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_glu_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_relu_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_softshrink_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_nn_functional_unfold_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_norm_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_normal_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_ones_like_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_permute_copy_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_rad2deg_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_remainder_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_repeat_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_roll_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_rsub_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_select_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_select_scatter_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_sigmoid_cuda_float16, 
test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_sign_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_sin_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_sinc_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_sinh_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_slice_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_slice_scatter_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_special_entr_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_special_i1e_cuda_float64, test/test_decomp.py::TestDecompCUDA::test_quick_special_log_ndtr_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_special_ndtri_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_split_list_args_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_split_with_sizes_copy_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_copy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_squeeze_multiple_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_stack_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_cuda_float32, 
test/test_decomp.py::TestDecompCUDA::test_quick_std_mean_unbiased_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_t_copy_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_t_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_take_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_tan_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_complex128, test/test_decomp.py::TestDecompCUDA::test_quick_trace_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_transpose_cuda_uint8, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_tril_cuda_int8, test/test_decomp.py::TestDecompCUDA::test_quick_triu_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_trunc_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_unbind_cuda_complex32, test/test_decomp.py::TestDecompCUDA::test_quick_uniform_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_unsafe_split_cuda_int64, test/test_decomp.py::TestDecompCUDA::test_quick_unsqueeze_copy_cuda_bfloat16, test/test_decomp.py::TestDecompCUDA::test_quick_var_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_copy_cuda_int16, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_complex64, test/test_decomp.py::TestDecompCUDA::test_quick_view_cuda_int32, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float16, test/test_decomp.py::TestDecompCUDA::test_quick_where_cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_xlogy_cuda_bool, test/test_decomp.py::TestDecompCUDA::test_quick_zero__cuda_float32, test/test_decomp.py::TestDecompCUDA::test_quick_zeros_cuda_uint8, 
test/test_decomp.py::TestDecompCUDA::test_rnn_decomp_module_nn_RNN_train_mode_cuda_float32, test/test_decomp.py::HasDecompTest::test_conv1d_decomposition 2025-12-04T11:46:09.5094195Z 2025-12-04T11:46:09.5094297Z Finished test_decomp 8/12 ... [2025-12-04 11:46:09.494989][3575278.019800089], took 8.47min 2025-12-04T11:46:09.5094670Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:46:09.5095025Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:46:09.5095253Z Running lazy/test_functionalization 1/1 ... [2025-12-04 11:46:09.501616][3575278.026430198] 2025-12-04T11:46:09.5095443Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:46:09.5095837Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_functionalization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:46:09.501807] 2025-12-04T11:46:11.7693554Z 2025-12-04T11:46:11.7694779Z lazy/test_functionalization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_functionalization_1.1_0a3873e1ab86c868_.log 2025-12-04T11:46:11.7696252Z Running 2 items in this shard: test/lazy/test_functionalization.py::LazyFuncionalizationTest::test_data_assign, test/lazy/test_functionalization.py::LazyFuncionalizationTest::test_lazy_init_with_view 2025-12-04T11:46:11.7697633Z 2025-12-04T11:46:11.7697974Z Finished lazy/test_functionalization 1/1 ... 
[2025-12-04 11:46:11.769044][3575280.293853764], took 0.04min 2025-12-04T11:46:11.7706316Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:46:11.7761561Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:46:11.7762031Z Running torch_np/test_random 1/1 ... [2025-12-04 11:46:11.775986][3575280.300800348] 2025-12-04T11:46:11.7762354Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:46:11.7763763Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_random.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:46:11.776176] 2025-12-04T11:46:13.8941001Z 2025-12-04T11:46:13.8942152Z torch_np/test_random 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_random_1.1_bd4aed60094f29fc_.log 2025-12-04T11:46:13.8951520Z Running 41 items in this shard: test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_False_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func0, 
test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_array_use_numpy_True_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_random_random, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_False_random_sample, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func0, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func1, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func2, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func3, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func6, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_func7, test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_random_random, 
test/torch_np/test_random.py::TestScalarReturn::test_rndm_scalar_use_numpy_True_random_sample, test/torch_np/test_random.py::TestShuffle::test_1d_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_1d_use_numpy_True, test/torch_np/test_random.py::TestShuffle::test_2d_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_2d_use_numpy_True, test/torch_np/test_random.py::TestShuffle::test_shuffle_list_use_numpy_False, test/torch_np/test_random.py::TestShuffle::test_shuffle_list_use_numpy_True, test/torch_np/test_random.py::TestChoice::test_choice_use_numpy_False, test/torch_np/test_random.py::TestChoice::test_choice_use_numpy_True, test/torch_np/test_random.py::TestNumpyGlobal::test_numpy_global 2025-12-04T11:46:13.8958698Z 2025-12-04T11:46:13.8958861Z Finished torch_np/test_random 1/1 ... [2025-12-04 11:46:13.893857][3575282.418666268], took 0.04min 2025-12-04T11:46:13.8959431Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:46:13.9012472Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:46:13.9016031Z Running nn/test_multihead_attention 1/1 ... [2025-12-04 11:46:13.901300][3575282.426113685] 2025-12-04T11:46:13.9016457Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:46:13.9017283Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'nn/test_multihead_attention.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:46:13.901490] 2025-12-04T11:47:44.7512144Z 2025-12-04T11:47:44.7513458Z nn/test_multihead_attention 1/1 was successful, full logs can be found in artifacts with path test/test-reports/nn.test_multihead_attention_1.1_b19c555f91a552ed_.log 2025-12-04T11:47:44.7519967Z Running 20 items in this shard: test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attention_average_attn_weights_False, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attention_average_attn_weights_True, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_3d_attn_mask, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_fast_path_invalid_shape, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_invalid_shape, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_nested_tensor_outside_fast_path, test/nn/test_multihead_attention.py::TestMultiheadAttentionNN::test_multihead_attn_no_bias, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_fast_path_check_with_mask_does_not_break_in_compile_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float16, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float32, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_batch_first_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float16, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float32, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attention_dtype_cuda_float64, 
test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_fast_path_query_and_bias_have_different_dtypes_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_fast_path_small_test_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_in_proj_bias_none_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_attn_in_proj_weight_none_cuda_float64, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_self_attn_two_masks_fast_path_cuda, test/nn/test_multihead_attention.py::TestMultiheadAttentionNNDeviceTypeCUDA::test_multihead_self_attn_two_masks_fast_path_mock_cuda 2025-12-04T11:47:44.7526005Z 2025-12-04T11:47:44.7526215Z Finished nn/test_multihead_attention 1/1 ... [2025-12-04 11:47:44.750791][3575373.275602629], took 1.51min 2025-12-04T11:47:44.7526845Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:47:44.7574445Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:44.7576153Z Running lazy/test_bindings 1/1 ... [2025-12-04 11:47:44.757469][3575373.282282535] 2025-12-04T11:47:44.7577687Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:44.7578150Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_bindings.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:47:44.757660] 2025-12-04T11:47:46.5747364Z 2025-12-04T11:47:46.5748527Z lazy/test_bindings 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_bindings_1.1_0dcef2b4998656ed_.log 2025-12-04T11:47:46.5749984Z Running 1 items in this shard: test/lazy/test_bindings.py::test_metrics 2025-12-04T11:47:46.5750328Z 2025-12-04T11:47:46.5750622Z Finished lazy/test_bindings 1/1 ... [2025-12-04 11:47:46.574304][3575375.099113613], took 0.03min 2025-12-04T11:47:46.5762462Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:47:46.5816069Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:46.5818197Z Running xpu/test_conv 1/1 ... [2025-12-04 11:47:46.581625][3575375.106439479] 2025-12-04T11:47:46.5818545Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:46.5819352Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'xpu/test_conv.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:47:46.581810] 2025-12-04T11:47:49.1822533Z 2025-12-04T11:47:49.1823514Z xpu/test_conv 1/1 was successful, full logs can be found in artifacts with path test/test-reports/xpu.test_conv_1.1_69f2c549c2ba4eaf_.log 2025-12-04T11:47:49.1824098Z Running 0 items in this shard: 2025-12-04T11:47:49.1824318Z 2025-12-04T11:47:49.1824565Z Finished xpu/test_conv 1/1 ... 
[2025-12-04 11:47:49.181886][3575377.706696285], took 0.04min 2025-12-04T11:47:49.1832306Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:47:49.1884770Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:49.1886660Z Running test_utils 1/1 ... [2025-12-04 11:47:49.188552][3575377.713366412] 2025-12-04T11:47:49.1886967Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:49.1888653Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:47:49.188736] 2025-12-04T11:48:11.5355813Z 2025-12-04T11:48:11.5356889Z test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_utils_1.1_e9262a35c8404ff9_.log 2025-12-04T11:48:11.6080908Z Running 6014 items in this shard: test/test_utils.py::TestCheckpoint::test_checkpoint, test/test_utils.py::TestCheckpoint::test_checkpoint_module_list, test/test_utils.py::TestCheckpoint::test_checkpoint_no_tensors, test/test_utils.py::TestCheckpoint::test_checkpoint_non_tensor, test/test_utils.py::TestCheckpoint::test_checkpoint_non_tensor_inputs_outputs, test/test_utils.py::TestCheckpoint::test_checkpoint_not_preserve_rng_state_and_without_reentrant, test/test_utils.py::TestCheckpoint::test_checkpoint_partial_grad, test/test_utils.py::TestCheckpoint::test_checkpoint_rng_cpu, test/test_utils.py::TestCheckpoint::test_checkpoint_rng_gpu, test/test_utils.py::TestCheckpoint::test_checkpoint_sequential_deprecated_multiple_args, test/test_utils.py::TestCheckpoint::test_checkpoint_sequential_deprecated_no_args, test/test_utils.py::TestCheckpoint::test_checkpoint_trigger, test/test_utils.py::TestCheckpoint::test_checkpoint_valid, 
test/test_utils.py::TestCheckpoint::test_checkpointing_without_reentrant_early_free, test/test_utils.py::TestCheckpoint::test_get_device_states_recursive, test/test_utils.py::TestCheckpoint::test_infer_device_state_recursive_meta, test/test_utils.py::TestCheckpoint::test_infer_device_state_recursive_multi_gpu, test/test_utils.py::TestDataLoaderUtils::test_multi_drop, test/test_utils.py::TestDataLoaderUtils::test_multi_keep, test/test_utils.py::TestDataLoaderUtils::test_random_seed, test/test_utils.py::TestDataLoaderUtils::test_single_drop, test/test_utils.py::TestDataLoaderUtils::test_single_keep, test/test_utils.py::TestCollectEnv::test_smoke, test/test_utils.py::TestHipify::test_import_hipify, test/test_utils.py::TestHipifyTrie::test_add_and_search_trie, test/test_utils.py::TestHipifyTrie::test_add_multiple_and_search_trie, test/test_utils.py::TestHipifyTrie::test_char_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_prefix_words_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_quote_escape, test/test_utils.py::TestHipifyTrie::test_single_export_trie_to_regex, test/test_utils.py::TestHipifyTrie::test_special_char_export_trie_to_regex, test/test_utils.py::TestAssert::test_assert_scriptable, test/test_utils.py::TestAssert::test_assert_true, test/test_utils.py::TestStandaloneCPPJIT::test_load_standalone, test/test_utils.py::TestRenderUtils::test_basic, test/test_utils.py::TestDeviceUtilsCUDA::test_basic_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_decorator_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_decorator_generator_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_H_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_T_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___getitem___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___radd___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rand___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rdiv___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmatmul___cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmod___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rmul___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___ror___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rpow___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rsub___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops___rxor___cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__batch_norm_with_update_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__chunk_cat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__native_batch_norm_legit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_lengths_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__segment_reduce_offsets_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__softmax_backward_data_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops__upsample_bilinear2d_aa_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_abs_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acos_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_acosh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addbmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcdiv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addcmul_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmm_decomposed_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addmv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_addr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_alias_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_all_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_allclose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_aminmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_angle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_any_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_arange_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argsort_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_argwhere_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_partial_views_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_as_strided_scatter_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_asinh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atanh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_1d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_2d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_atleast_3d_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_baddbmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bernoulli_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bfloat16_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bincount_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_and_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_left_shift_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_not_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_or_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_right_shift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bitwise_xor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_block_diag_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bool_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_shapes_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_broadcast_to_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_bucketize_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_byte_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cartesian_prod_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cauchy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdist_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cdouble_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ceil_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cfloat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chalf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_char_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_inverse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cholesky_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_chunk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_max_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clamp_min_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_clone_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_column_stack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_combinations_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_complex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_conj_physical_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_constant_pad_nd_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_contiguous_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_copysign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_corrcoef_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cos_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cosh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_count_nonzero_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cov_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cross_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cummin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumprod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumsum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_cumulative_trapezoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_deg2rad_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diag_embed_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagflat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diagonal_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_diff_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_digamma_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dist_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_floor_rounding_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_no_rounding_mode_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_div_trunc_rounding_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_double_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_dstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_einsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_permuted_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_empty_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eq_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_equal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_erfinv_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expand_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_expm1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_exponential_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e4m3fn, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e4m3fnuz, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e5m2, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_float8_e5m2fnuz, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_eye_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft2_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_fftshift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_hfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ifftshift_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_ihfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_irfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfft_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fft_rfftn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flatten_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flip_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fliplr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_flipud_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_float_power_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_floor_divide_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_fmod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frac_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_frexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_full_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gather_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gcd_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ge_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geometric_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_geqrf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gradient_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_grid_sampler_3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_gt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_half_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hash_tensor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_heaviside_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_histc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_hypot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_i0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igamma_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igammac_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_igammac_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_imag_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_put_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_mean_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_reduce_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_index_select_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_inner_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_int_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isclose_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isfinite_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isinf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isnan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isneginf_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isposinf_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_isreal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_istft_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_istft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_item_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_2inputs_2outputs_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_binary_return_by_ref_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_jiterator_unary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kron_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_kthvalue_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lcm_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ldexp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_le_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lerp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lgamma_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cholesky_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cond_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_cross_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_det_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_diagonal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eig_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvals_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_eigvalsh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_householder_product_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_inv_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_factor_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_ldl_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lstsq_grad_oriented_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_factor_ex_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_lu_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_power_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_matrix_rank_hermitian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_multi_dot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_norm_subgradients_at_zero_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_hermitian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_pinv_singular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_qr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_slogdet_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_ex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_solve_triangular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_svdvals_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorinv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_tensorsolve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vander_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vecdot_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linalg_vector_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_linspace_tensor_overload_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log10_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log1p_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_normal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_log_softmax_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logaddexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logcumsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logdet_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_and_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_not_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_or_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logical_xor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logspace_tensor_overload_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_logsumexp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_long_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_lu_unpack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mH_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mT_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_argmin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumprod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_cumsum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_fill_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_log_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logaddexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_logsumexp_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_median_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_normalize_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_select_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_softmin_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_std_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_sum_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_masked_var_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matmul_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_matrix_exp_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_pool2d_with_indices_backward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_no_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_max_reduction_with_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_maximum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_median_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_list_of_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_meshgrid_variadic_tensors_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_binary_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_no_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_min_reduction_with_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_minimum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mode_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_movedim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_msort_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mul_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_multinomial_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mv_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nan_to_num_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanmedian_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanquantile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nanquantile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nansum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_narrow_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_batch_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_dropout_backward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_native_layer_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ne_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_neg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_empty_strided_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_full_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_ones_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_new_zeros_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nextafter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_alpha_dropout_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_avg_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_celu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_channel_shuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_conv_transpose3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cosine_similarity_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_cross_entropy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_ctc_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_ctc_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_dropout_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_elu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_bag_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_embedding_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_fractional_max_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gaussian_nll_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_gelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_glu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_grid_sample_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_group_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardsigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardswish_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hardtanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_hinge_embedding_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_huber_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_instance_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_area_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bicubic_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_linear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_nearest_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_interpolate_trilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_kl_div_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_l1_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_layer_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_leaky_relu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_linear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_local_response_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_logsigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_margin_ranking_loss_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_pool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool1d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool2d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_max_unpool3d_grad_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mish_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_mse_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_head_attention_forward_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multi_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_margin_loss_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_nll_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_normalize_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_one_hot_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_circular_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_constant_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_reflect_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pad_replicate_negative_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pairwise_distance_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pdist_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pdist_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_shuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_pixel_unshuffle_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_poisson_nll_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_prelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu6_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_relu_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rms_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_rrelu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_selu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_complex_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_complex_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_silu_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_smooth_l1_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_soft_margin_loss_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softmin_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softplus_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_softsign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_tanhshrink_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_threshold_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_unfold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_bilinear_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nn_functional_upsample_nearest_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_nonzero_static_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_fro_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_inf_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_norm_nuc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_in_place_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_normal_number_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ones_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ormqr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_outer_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pca_lowrank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_permute_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pinverse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polar_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polar_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_2_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_3_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_polygamma_polygamma_n_4_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_positive_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_pow_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_put_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_qr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_quantile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_quantile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rad2deg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rand_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randint_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_randn_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_ravel_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_real_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reciprocal_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_remainder_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_renorm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_repeat_interleave_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_reshape_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resize_as__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_conj_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_resolve_neg_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_roll_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rot90_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_round_decimals_neg_3_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsqrt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_rsub_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scalar_tensor_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_add_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amax_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_amin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_mean_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_prod_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_scatter_reduce_sum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_searchsorted_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_select_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sgn_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_short_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sigmoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sign_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_bartlett_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_bartlett_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_blackman_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_blackman_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_cosine_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_cosine_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_exponential_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_exponential_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_gaussian_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_gaussian_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_cosine_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_cosine_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_hamming_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_general_hamming_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hamming_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hamming_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hann_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_hann_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_kaiser_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_kaiser_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_nuttall_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signal_windows_nuttall_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_signbit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sin_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sinh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_slice_scatter_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_softmax_with_dtype_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sort_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_mm_reduce_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sparse_sampled_addmm_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_airy_ai_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_j1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_bessel_y1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_u_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_v_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_chebyshev_polynomial_w_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_entr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_erfcx_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_h_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_hermite_polynomial_he_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i0e_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_i1e_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_laguerre_polynomial_l_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_legendre_polynomial_p_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_log_ndtr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_i1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_modified_bessel_k1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtr_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_ndtri_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_scaled_modified_bessel_k1_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_spherical_bessel_j0_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_xlog1py_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_special_zeta_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_list_args_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_split_with_sizes_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sqrt_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_square_cuda_uint8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_squeeze_multiple_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_mean_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_std_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_stft_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sub_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_sum_to_size_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_svd_lowrank_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_bool, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_t_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_along_dim_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_take_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tan_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tanh_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_int8, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensor_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tensordot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tile_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_to_sparse_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_topk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch__scaled_mm_cuda_float8_e4m3fn, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch__scaled_mm_v2_cuda_float8_e4m3fn, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__flash_attention_forward_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trace_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_transpose_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapezoid_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trapz_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triangular_solve_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_indices_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_tril_indices_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_indices_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_triu_indices_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_true_divide_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_trunc_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unbind_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unflatten_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_bfloat16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unfold_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_uniform_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_consecutive_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unique_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unravel_index_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_chunk_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsafe_split_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_complex64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_unsqueeze_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_mean_unbiased_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_var_unbiased_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vdot_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_complex_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_real_cuda_complex128, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_as_real_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_copy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int32, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_view_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vsplit_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_vstack_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_where_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int16, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_xlogy_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zero__cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_float64, 
test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_bfloat16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_bool, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex128, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_complex64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_float64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int16, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int32, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int64, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_int8, test/test_utils.py::TestDeviceUtilsCUDA::test_device_mode_ops_zeros_like_cuda_uint8, test/test_utils.py::TestDeviceUtilsCUDA::test_get_default_device_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_get_default_device_more_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_nn_module_cuda, test/test_utils.py::TestDeviceUtilsCUDA::test_set_default_device_cuda, test/test_utils.py::TestCppExtensionUtils::test_cc_compiler_is_ok, test/test_utils.py::TestCppExtensionUtils::test_cpp_compiler_is_ok, test/test_utils.py::TestTraceback::test_basic, 
test/test_utils.py::TestTraceback::test_captured_traceback, test/test_utils.py::TestTraceback::test_captured_traceback_format_all, test/test_utils.py::TestTraceback::test_captured_traceback_format_all_cached, test/test_utils.py::TestTraceback::test_format_traceback_short, test/test_utils.py::TestTryImport::test_import_existing, test/test_utils.py::TestTryImport::test_import_imported, test/test_utils.py::TestTryImport::test_import_missing, test/test_utils.py::TestDeprecate::test_deprecated 2025-12-04T11:48:11.6745824Z 2025-12-04T11:48:11.6745937Z Finished test_utils 1/1 ... [2025-12-04 11:48:11.540153][3575400.064961697], took 0.37min 2025-12-04T11:48:11.6746319Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:11.6746682Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:11.6746903Z Running test_pytree 1/1 ... [2025-12-04 11:48:11.546860][3575400.071674023] 2025-12-04T11:48:11.6747075Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:11.6747449Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_pytree.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:48:11.547047] 2025-12-04T11:48:14.6716174Z 2025-12-04T11:48:14.6717112Z test_pytree 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_pytree_1.1_dc3ae5a73daae082_.log 2025-12-04T11:48:14.6735030Z Running 100 items in this shard: test/test_pytree.py::TestGenericPytree::test_aligned_public_apis, test/test_pytree.py::TestGenericPytree::test_broadcast_to_and_flatten_cxx, test/test_pytree.py::TestGenericPytree::test_broadcast_to_and_flatten_python, test/test_pytree.py::TestGenericPytree::test_enum_treespec_roundtrip_cxx, test/test_pytree.py::TestGenericPytree::test_enum_treespec_roundtrip_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_defaultdict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_defaultdict_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_deque_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_deque_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_dict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_dict_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_leaf_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_leaf_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_list_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_list_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_namedtuple_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_namedtuple_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_nested_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_nested_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_ordereddict_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_ordereddict_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_max_cxx, 
test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_max_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_min_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_return_types_min_python, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_tuple_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_unflatten_tuple_python, test/test_pytree.py::TestGenericPytree::test_flatten_with_is_leaf_cxx, test/test_pytree.py::TestGenericPytree::test_flatten_with_is_leaf_python, test/test_pytree.py::TestGenericPytree::test_is_namedtuple_cxx, test/test_pytree.py::TestGenericPytree::test_is_namedtuple_python, test/test_pytree.py::TestGenericPytree::test_is_structseq_cxx, test/test_pytree.py::TestGenericPytree::test_is_structseq_python, test/test_pytree.py::TestGenericPytree::test_pytree_serialize_bad_input_cxx, test/test_pytree.py::TestGenericPytree::test_pytree_serialize_bad_input_python, test/test_pytree.py::TestGenericPytree::test_register_pytree_node_cxx, test/test_pytree.py::TestGenericPytree::test_register_pytree_node_python, test/test_pytree.py::TestGenericPytree::test_tree_all_any_cxx, test/test_pytree.py::TestGenericPytree::test_tree_all_any_python, test/test_pytree.py::TestGenericPytree::test_tree_map_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_dict_order_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_dict_order_python, test/test_pytree.py::TestGenericPytree::test_tree_map_multi_inputs_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_multi_inputs_python, test/test_pytree.py::TestGenericPytree::test_tree_map_only_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_only_predicate_fn_cxx, test/test_pytree.py::TestGenericPytree::test_tree_map_only_predicate_fn_python, test/test_pytree.py::TestGenericPytree::test_tree_map_only_python, test/test_pytree.py::TestGenericPytree::test_tree_map_python, 
test/test_pytree.py::TestPythonPytree::test_constant, test/test_pytree.py::TestPythonPytree::test_constant_default_eq_error, test/test_pytree.py::TestPythonPytree::test_constant_default_hash_error, test/test_pytree.py::TestPythonPytree::test_dataclass, test/test_pytree.py::TestPythonPytree::test_deprecated_register_pytree_node, test/test_pytree.py::TestPythonPytree::test_flatten_flatten_with_key_consistency, test/test_pytree.py::TestPythonPytree::test_import_pytree_doesnt_import_optree, test/test_pytree.py::TestPythonPytree::test_key_access, test/test_pytree.py::TestPythonPytree::test_key_str, test/test_pytree.py::TestPythonPytree::test_pytree_context_serialize_bad, test/test_pytree.py::TestPythonPytree::test_pytree_custom_type_serialize, test/test_pytree.py::TestPythonPytree::test_pytree_custom_type_serialize_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_bad_protocol, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_defaultdict_enum, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_enum, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_namedtuple, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_namedtuple_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_register_bad, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec0, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec1, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec2, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec3, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec4, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec5, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec6, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec7, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec8, test/test_pytree.py::TestPythonPytree::test_pytree_serialize_spec9, 
test/test_pytree.py::TestPythonPytree::test_register_dataclass_class, test/test_pytree.py::TestPythonPytree::test_saved_serialized, test/test_pytree.py::TestPythonPytree::test_tree_flatten_with_path_is_leaf, test/test_pytree.py::TestPythonPytree::test_tree_flatten_with_path_roundtrip, test/test_pytree.py::TestPythonPytree::test_tree_leaves_with_path, test/test_pytree.py::TestPythonPytree::test_tree_map_with_path, test/test_pytree.py::TestPythonPytree::test_tree_map_with_path_multiple_trees, test/test_pytree.py::TestPythonPytree::test_treespec_equality, test/test_pytree.py::TestPythonPytree::test_treespec_repr, test/test_pytree.py::TestCxxPytree::test_pytree_custom_type_serialize, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_namedtuple, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec0, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec1, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec2, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec3, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec4, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec5, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec6, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec7, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec8, test/test_pytree.py::TestCxxPytree::test_pytree_serialize_spec9, test/test_pytree.py::TestCxxPytree::test_treespec_equality, test/test_pytree.py::TestCxxPytree::test_treespec_repr 2025-12-04T11:48:14.6746331Z 2025-12-04T11:48:14.6746433Z Finished test_pytree 1/1 ... 
[2025-12-04 11:48:14.671397][3575403.196207417], took 0.05min 2025-12-04T11:48:14.6746826Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:14.6779549Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:14.6781440Z Running test_namedtuple_return_api 1/1 ... [2025-12-04 11:48:14.678053][3575403.202867054] 2025-12-04T11:48:14.6781635Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:14.6783643Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_namedtuple_return_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:14.678248] 2025-12-04T11:48:18.0006121Z 2025-12-04T11:48:18.0006902Z test_namedtuple_return_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_namedtuple_return_api_1.1_b7b6808e8182315e_.log 2025-12-04T11:48:18.0007966Z Running 3 items in this shard: test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_import_return_types, test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_namedtuple_return, test/test_namedtuple_return_api.py::TestNamedTupleAPI::test_native_functions_yaml 2025-12-04T11:48:18.0008617Z 2025-12-04T11:48:18.0008827Z Finished test_namedtuple_return_api 1/1 ... [2025-12-04 11:48:18.000346][3575406.525154176], took 0.06min 2025-12-04T11:48:18.0017994Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:18.0069093Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:18.0071063Z Running profiler/test_record_function 1/1 ... 
[2025-12-04 11:48:18.006998][3575406.531812443] 2025-12-04T11:48:18.0071343Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:18.0073292Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'profiler/test_record_function.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:18.007182] 2025-12-04T11:48:20.2249535Z 2025-12-04T11:48:20.2250504Z profiler/test_record_function 1/1 was successful, full logs can be found in artifacts with path test/test-reports/profiler.test_record_function_1.1_db898f639cb5799c_.log 2025-12-04T11:48:20.2253119Z Running 6 items in this shard: test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_delegation_with_profiler, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_datapipe_with_record_function_fork, test/profiler/test_record_function.py::TestRecordFunction::test_python_dispatch_mode_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_python_subclass_record_function, test/profiler/test_record_function.py::TestRecordFunction::test_record_function 2025-12-04T11:48:20.2254938Z 2025-12-04T11:48:20.2255529Z Finished profiler/test_record_function 1/1 ... [2025-12-04 11:48:20.224684][3575408.749492632], took 0.04min 2025-12-04T11:48:20.2269974Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:20.2324572Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:20.2328386Z Running test_compile_benchmark_util 1/1 ... 
[2025-12-04 11:48:20.232549][3575408.75735623] 2025-12-04T11:48:20.2328699Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:20.2329456Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_compile_benchmark_util.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:20.232769] 2025-12-04T11:48:25.9560074Z 2025-12-04T11:48:25.9561225Z test_compile_benchmark_util 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_compile_benchmark_util_1.1_cc788c1cadfb723e_.log 2025-12-04T11:48:25.9562598Z Running 1 items in this shard: test/test_compile_benchmark_util.py::TestCompileBenchmarkUtil::test_training_and_inference 2025-12-04T11:48:25.9563113Z 2025-12-04T11:48:25.9563402Z Finished test_compile_benchmark_util 1/1 ... [2025-12-04 11:48:25.955720][3575414.480528464], took 0.10min 2025-12-04T11:48:25.9578514Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:25.9633037Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:25.9634879Z Running test_set_default_mobile_cpu_allocator 1/1 ... [2025-12-04 11:48:25.963359][3575414.488173355] 2025-12-04T11:48:25.9635204Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:25.9636529Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_set_default_mobile_cpu_allocator.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:48:25.963551] 2025-12-04T11:48:28.1812488Z 2025-12-04T11:48:28.1813630Z test_set_default_mobile_cpu_allocator 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_set_default_mobile_cpu_allocator_1.1_ab994fd2aff7db46_.log 2025-12-04T11:48:28.1814932Z Running 2 items in this shard: test/test_set_default_mobile_cpu_allocator.py::TestSetDefaultMobileCPUAllocator::test_exception, test/test_set_default_mobile_cpu_allocator.py::TestSetDefaultMobileCPUAllocator::test_no_exception 2025-12-04T11:48:28.1815668Z 2025-12-04T11:48:28.1815939Z Finished test_set_default_mobile_cpu_allocator 1/1 ... [2025-12-04 11:48:28.180899][3575416.705708445], took 0.04min 2025-12-04T11:48:28.1828002Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:28.1881107Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:28.1882987Z Running test_fake_tensor 1/1 ... [2025-12-04 11:48:28.188147][3575416.712961302] 2025-12-04T11:48:28.1883301Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:28.1884539Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_fake_tensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:48:28.188340] 2025-12-04T11:48:44.3854211Z 2025-12-04T11:48:44.3855248Z test_fake_tensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_fake_tensor_1.1_bd8e0be8f2c5da5d_.log 2025-12-04T11:48:44.3900942Z Running 288 items in this shard: test/test_fake_tensor.py::FakeTensorTest::test__adaptive_avg_pool2d_backward, test/test_fake_tensor.py::FakeTensorTest::test_alias_call, test/test_fake_tensor.py::FakeTensorTest::test_allow_meta, test/test_fake_tensor.py::FakeTensorTest::test_aten_copy_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_aten_index_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_aten_slice_scatter_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_basic, test/test_fake_tensor.py::FakeTensorTest::test_batch_tensor, test/test_fake_tensor.py::FakeTensorTest::test_binary_op_type_promotion, test/test_fake_tensor.py::FakeTensorTest::test_constructor, test/test_fake_tensor.py::FakeTensorTest::test_conv_nhwc, test/test_fake_tensor.py::FakeTensorTest::test_convert_fake_to_real, test/test_fake_tensor.py::FakeTensorTest::test_cpu_fallback, test/test_fake_tensor.py::FakeTensorTest::test_cuda_initialized, test/test_fake_tensor.py::FakeTensorTest::test_cuda_lstm, test/test_fake_tensor.py::FakeTensorTest::test_cudnn_rnn_with_fallback, test/test_fake_tensor.py::FakeTensorTest::test_cudnn_rnn_without_fallback, test/test_fake_tensor.py::FakeTensorTest::test_custom_op_fallback, test/test_fake_tensor.py::FakeTensorTest::test_data_dependent_operator, test/test_fake_tensor.py::FakeTensorTest::test_deepcopy, test/test_fake_tensor.py::FakeTensorTest::test_device_inplace_copy, test/test_fake_tensor.py::FakeTensorTest::test_embedding_bag_meta, test/test_fake_tensor.py::FakeTensorTest::test_export_numpy, test/test_fake_tensor.py::FakeTensorTest::test_fake_device, test/test_fake_tensor.py::FakeTensorTest::test_fake_dispatch_keys, test/test_fake_tensor.py::FakeTensorTest::test_fake_grad_copy, 
test/test_fake_tensor.py::FakeTensorTest::test_fake_mode_error, test/test_fake_tensor.py::FakeTensorTest::test_fast_div, test/test_fake_tensor.py::FakeTensorTest::test_fast_div_int_to_float, test/test_fake_tensor.py::FakeTensorTest::test_from_numpy, test/test_fake_tensor.py::FakeTensorTest::test_fsdp_flat_param, test/test_fake_tensor.py::FakeTensorTest::test_full, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_complex128, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_complex64, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float32, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float64, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fn, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fnuz, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e5m2, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_float8_e5m2fnuz, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int16, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int32, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int64, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_int8, test/test_fake_tensor.py::FakeTensorTest::test_index_cuda_with_cpu_uint8, test/test_fake_tensor.py::FakeTensorTest::test_index_put_error, test/test_fake_tensor.py::FakeTensorTest::test_jagged_fake_to_fake_preserved, test/test_fake_tensor.py::FakeTensorTest::test_like_constructor, test/test_fake_tensor.py::FakeTensorTest::test_mixed_real_and_fake_inputs, test/test_fake_tensor.py::FakeTensorTest::test_mode, test/test_fake_tensor.py::FakeTensorTest::test_nan_to_num, test/test_fake_tensor.py::FakeTensorTest::test_nanmean_out, test/test_fake_tensor.py::FakeTensorTest::test_new, test/test_fake_tensor.py::FakeTensorTest::test_no_tag_func, 
test/test_fake_tensor.py::FakeTensorTest::test_non_kwarg_device, test/test_fake_tensor.py::FakeTensorTest::test_non_overlapping_stride_zero, test/test_fake_tensor.py::FakeTensorTest::test_non_parameter_grad, test/test_fake_tensor.py::FakeTensorTest::test_normalize_device, test/test_fake_tensor.py::FakeTensorTest::test_op_with_zero_dim_bypassed, test/test_fake_tensor.py::FakeTensorTest::test_out_multi_device, test/test_fake_tensor.py::FakeTensorTest::test_parameter_instantiation, test/test_fake_tensor.py::FakeTensorTest::test_parameter_view, test/test_fake_tensor.py::FakeTensorTest::test_print_in_fake_mode, test/test_fake_tensor.py::FakeTensorTest::test_randperm, test/test_fake_tensor.py::FakeTensorTest::test_recursive_invocation, test/test_fake_tensor.py::FakeTensorTest::test_repr, test/test_fake_tensor.py::FakeTensorTest::test_same_shape_env_preserved, test/test_fake_tensor.py::FakeTensorTest::test_scalar_inputs, test/test_fake_tensor.py::FakeTensorTest::test_scan_reverse_False, test/test_fake_tensor.py::FakeTensorTest::test_scan_reverse_True, test/test_fake_tensor.py::FakeTensorTest::test_setitem, test/test_fake_tensor.py::FakeTensorTest::test_shape_take_not_device, test/test_fake_tensor.py::FakeTensorTest::test_split_return_self, test/test_fake_tensor.py::FakeTensorTest::test_throw, test/test_fake_tensor.py::FakeTensorTest::test_tolist, test/test_fake_tensor.py::FakeTensorTest::test_type_as, test/test_fake_tensor.py::FakeTensorTest::test_unbind_copy_out, test/test_fake_tensor.py::FakeTensorTest::test_unsqueeze_copy, test/test_fake_tensor.py::FakeTensorTest::test_upsample_bilinear_small_channels, test/test_fake_tensor.py::FakeTensorTest::test_zero_dim, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test__adaptive_avg_pool2d_backward_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_alias_call_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_allow_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_copy_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_index_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_aten_slice_scatter_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_basic_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_batch_tensor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_binary_op_type_promotion_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_constructor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_conv_nhwc_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_convert_fake_to_real_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cpu_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cuda_initialized_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cuda_lstm_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cudnn_rnn_with_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_cudnn_rnn_without_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_custom_op_fallback_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_data_dependent_operator_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_deepcopy_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_device_inplace_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_embedding_bag_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_export_numpy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_dispatch_keys_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_grad_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fake_mode_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fast_div_int_to_float_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fast_div_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_from_numpy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_fsdp_flat_param_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_full_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_complex128_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_complex64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float32_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fn_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e4m3fnuz_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e5m2_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_float8_e5m2fnuz_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int16_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int32_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int64_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_int8_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_cuda_with_cpu_uint8_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_index_put_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_jagged_fake_to_fake_preserved_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_like_constructor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_mixed_real_and_fake_inputs_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_nan_to_num_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_nanmean_out_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_new_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_no_tag_func_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_kwarg_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_overlapping_stride_zero_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_non_parameter_grad_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_normalize_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_op_with_zero_dim_bypassed_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_out_multi_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_parameter_instantiation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_parameter_view_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_print_in_fake_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_randperm_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_recursive_invocation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_repr_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_same_shape_env_preserved_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scalar_inputs_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scan_reverse_False_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_scan_reverse_True_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_setitem_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_shape_take_not_device_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_split_return_self_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_throw_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_tolist_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_type_as_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_unbind_copy_out_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_unsqueeze_copy_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_upsample_bilinear_small_channels_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorTest::test_zero_dim_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorConstHandling::test_aliased_const_write, test/test_fake_tensor.py::FakeTensorConstHandling::test_constant_invalidation, test/test_fake_tensor.py::FakeTensorConstHandling::test_constant_propagate_through_functions, test/test_fake_tensor.py::FakeTensorConstHandling::test_fake_tensor_batch_norm_cpu, test/test_fake_tensor.py::FakeTensorConstHandling::test_fake_tensor_in_intlist_repro, test/test_fake_tensor.py::FakeTensorConstHandling::test_inplace_add, test/test_fake_tensor.py::FakeTensorConstHandling::test_inplace_view_invalidation, test/test_fake_tensor.py::FakeTensorConstHandling::test_shared_storage_invalidation, test/test_fake_tensor.py::FakeTensorConstHandling::test_shared_storages, test/test_fake_tensor.py::FakeTensorConstHandling::test_simple, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_aliased_const_write_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_constant_invalidation_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_constant_propagate_through_functions_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_fake_tensor_batch_norm_cpu_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_fake_tensor_in_intlist_repro_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_inplace_add_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_inplace_view_invalidation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_shared_storage_invalidation_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_shared_storages_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConstHandling::test_simple_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyCatCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyCubeCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyMulCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyMulScalarCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyNMSCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyNonzeroCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySortCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySplitCopyCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpySplitCopyWithIntCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyTakeCustomOp_cuda_float32, 
test/test_fake_tensor.py::FakeTensorOpInfoTestCUDA::test_fake_NumpyViewCopyCustomOp_cuda_float32, test/test_fake_tensor.py::FakeTensorConverterTest::test_dead_key, test/test_fake_tensor.py::FakeTensorConverterTest::test_dead_weak_ref, test/test_fake_tensor.py::FakeTensorConverterTest::test_memoized_conversion_from_meta, test/test_fake_tensor.py::FakeTensorConverterTest::test_memoized_conversion_to_meta, test/test_fake_tensor.py::FakeTensorConverterTest::test_multiple_modes, test/test_fake_tensor.py::FakeTensorConverterTest::test_no_active_mode, test/test_fake_tensor.py::FakeTensorConverterTest::test_no_ref_cycle, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_mode_error, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_tensor_storages_non_view, test/test_fake_tensor.py::FakeTensorConverterTest::test_separate_tensor_storages_view, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_dead_key_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_dead_weak_ref_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_memoized_conversion_from_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_memoized_conversion_to_meta_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_multiple_modes_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_no_active_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_no_ref_cycle_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_mode_error_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_tensor_storages_non_view_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorConverterTest::test_separate_tensor_storages_view_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_conv_c1_backward, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_cross_entropy_loss, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_embedding_bag_private, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_fake_gpu_no_init, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_flash_attention, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_like_ops, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_module_to, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_move_meta_tensor, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_move_module_under_fake, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_no_dispatch_with_like_function, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_non_kwarg_only_device, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_sparse_new, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_str_storage, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_tensor_constructors_all_have_kwarg_device, test/test_fake_tensor.py::FakeTensorOperatorInvariants::test_tensor_new, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_conv_c1_backward_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_cross_entropy_loss_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_embedding_bag_private_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_fake_gpu_no_init_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_flash_attention_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_like_ops_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_module_to_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_move_meta_tensor_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_move_module_under_fake_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_no_dispatch_with_like_function_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_non_kwarg_only_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_sparse_new_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_str_storage_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_tensor_constructors_all_have_kwarg_device_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorOperatorInvariants::test_tensor_new_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorPropTest::test_fake_tensor_prop_on_nn_module, test/test_fake_tensor.py::FakeTensorPropTest::test_fake_tensor_prop_on_nn_module_with_optional_args, test/test_fake_tensor.py::FakeTensorPropTest::test_nan_to_num, test/test_fake_tensor.py::FakeTensorPropTest::test_nonzero_stride, test/test_fake_tensor.py::FakeTensorPropTest::test_torch_load_with_fake_mode, test/test_fake_tensor.py::FakeTensorPropTest::test_unbacked_shape_realloc, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_fake_tensor_prop_on_nn_module_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_fake_tensor_prop_on_nn_module_with_optional_args_propagate_real_tensors, 
test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_nan_to_num_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_nonzero_stride_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_torch_load_with_fake_mode_propagate_real_tensors, test/test_fake_tensor.py::PropagateRealTensorsFakeTensorPropTest::test_unbacked_shape_realloc_propagate_real_tensors, test/test_fake_tensor.py::FakeTensorSerialization::test_serialization, test/test_fake_tensor.py::FakeTensorSerialization::test_serialization_with_tracing, test/test_fake_tensor.py::FakeTensorDispatchCache::test__upsample_bilinear2d_aa_backward_dynamic_shapes, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_aten_index, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_bypass, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_default_device, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_default_dtype, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_dispatch_key_set, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_hit, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_inplace_op, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_constants, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_device, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_dtype, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_conj, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_inference, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_is_neg, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_memory_format, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_requires_grad, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_shape, 
test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_storage_offset, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_key_stride, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_tuple_outputs, test/test_fake_tensor.py::FakeTensorDispatchCache::test_cache_view_op, test/test_fake_tensor.py::FakeTensorDispatchCache::test_empty_list, test/test_fake_tensor.py::FakeTensorDispatchCache::test_fft_hfft2_issue145522, test/test_fake_tensor.py::FakeTensorDispatchCache::test_from_buffer, test/test_fake_tensor.py::FakeTensorDispatchCache::test_inference_mode, test/test_fake_tensor.py::FakeTensorDispatchCache::test_invoke_subgraph, test/test_fake_tensor.py::FakeTensorDispatchCache::test_invoke_subgraph_cacheable_inplace, test/test_fake_tensor.py::FakeTensorDispatchCache::test_meta_tensor_to_fake_cpu, test/test_fake_tensor.py::FakeTensorDispatchCache::test_shape_env_settings, test/test_fake_tensor.py::FakeTensorDispatchCache::test_unbacked_output, test/test_fake_tensor.py::FakeTensorDispatchCache::test_wrapper_tensor_subclass_different_device, test/test_fake_tensor.py::FakeTensorPreferDeviceType::test_fake_tensor_prefer_device_type, test/test_fake_tensor.py::FakeTensorPreferDeviceType::test_fake_tensor_prefer_device_type_cpu_only 2025-12-04T11:48:44.3935459Z 2025-12-04T11:48:44.3935579Z Finished test_fake_tensor 1/1 ... [2025-12-04 11:48:44.385406][3575432.910214924], took 0.27min 2025-12-04T11:48:44.3935993Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:44.3936351Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:44.3936565Z Running test_stateless 1/1 ... 
[2025-12-04 11:48:44.392632][3575432.917446191] 2025-12-04T11:48:44.3936741Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:44.3937119Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_stateless.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:44.392819] 2025-12-04T11:48:49.3661399Z 2025-12-04T11:48:49.3662575Z test_stateless 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_stateless_1.1_2eb0048a9c485d2f_.log 2025-12-04T11:48:49.3676650Z Running 50 items in this shard: test/test_stateless.py::TestStatelessFunctionalAPI::test_circular_references_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_circular_references_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_batch_norm_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_batch_norm_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_member_reference_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_member_reference_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_multiple_dicts_error, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_tuple_dicts, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_error_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_error_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_stateless, 
test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_data_parallel_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_gradient_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_gradient_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_jit_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_jit_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_kwargs_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_functional_call_with_kwargs_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_in_place_operator_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_in_place_operator_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_module_fail_reset_to_original_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_module_fail_reset_to_original_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_some_weights_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_some_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_special_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_special_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_some_weights_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_some_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_stateless, 
test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrize_tie_weights_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrized_module_change_parametrization_original_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_reparametrized_module_change_parametrization_original_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_strict_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_strict_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_setattr_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_errors_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_errors_torch_func, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_no_error_without_flag, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_warns_stateless, test/test_stateless.py::TestStatelessFunctionalAPI::test_tied_weights_warns_torch_func, test/test_stateless.py::TestStatelessDeprecation::test_private_stateless_warns, test/test_stateless.py::TestStatelessDeprecation::test_stateless_functional_call_warns, test/test_stateless.py::TestPythonOptimizeMode::test_runs_with_optimize_flag 2025-12-04T11:48:49.3684412Z 2025-12-04T11:48:49.3684527Z Finished test_stateless 1/1 ... 
[2025-12-04 11:48:49.365818][3575437.890627701], took 0.08min 2025-12-04T11:48:49.3684916Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T11:48:49.3729341Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:49.3731294Z Running test_binary_ufuncs 1/1 ... [2025-12-04 11:48:49.372942][3575437.897756219] 2025-12-04T11:48:49.3731490Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:49.3732964Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_binary_ufuncs.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:49.373135] 2025-12-04T11:51:52.9727665Z 2025-12-04T11:51:52.9728512Z test_binary_ufuncs 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_binary_ufuncs_1.1_13b24f605fe49dc6_.log 2025-12-04T11:51:53.1443324Z Running 12917 items in this shard: test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___add___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___and___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___eq___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___floordiv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ge___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___gt___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iadd___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___iand___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ifloordiv___not_implemented_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ilshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___imul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ior___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ipow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___irshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___isub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___itruediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ixor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___le___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___lt___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___matmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___mul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ne___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___or___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___pow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___radd___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rand___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rfloordiv___not_implemented_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rlshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmatmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmod___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rmul___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___ror___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rpow___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rrshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rshift___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rsub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rtruediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___rxor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___sub___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___truediv___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test___xor___not_implemented_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_broadcast_empty_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_add_with_tail_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_addcmul_scalars_as_floats_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_addsub_half_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_atan2_edgecases_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_or_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_hypot_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_or_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_float_power_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gcd_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_logical_xor_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_batch_vs_slicing_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_op_mem_overlap_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_op_scalar_device_unspecified_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_binary_ops_with_scalars_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bitwise_ops_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_bool_tensor_comparison_ops_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ge_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_minimum_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_floor_divide_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_or_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_broadcasting_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cdiv_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cmul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float16_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_comparison_ops_type_promotion_and_broadcasting_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_div_underflow_overflow_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_div_underflow_overflow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_complex_scalar_pow_tensor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_eq_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmin_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_isclose_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_igammac_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_and_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_isclose_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_atan2_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_max_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logaddexp_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_polar_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_true_divide_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_large_dim_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_minimum_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_laguerre_polynomial_l_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_size1_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igamma_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_maximum_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_and_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_u_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_every_other_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_add_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gcd_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_gt_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_xlog1py_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_left_shift_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_floor_divide_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_mul_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_v_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_contig_vs_transposed_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_copysign_subgradient_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cpow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cpu_tensor_pow_cuda_scalar_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cremainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cross_device_binary_ops_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cross_device_inplace_error_msg_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_csub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cuda_tensor_pow_scalar_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_cumulative_trapezoid_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_and_floordiv_script_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_and_floordiv_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_modes_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_nonfinite_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_div_rounding_numpy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divide_by_zero_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_divmul_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_power_exceptions_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_scalar_pow_float_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_float_scalar_pow_float_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_div_extremal_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_int_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_int_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_int_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_int_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_scalar_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_tensor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_floor_divide_zero_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_float_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_by_zero_integral_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_overflow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_overflow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_overflow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_overflow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_fmod_remainder_overflow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex128_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_complex_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cross_device_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_heaviside_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_idiv_and_ifloordiv_vs_python_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_inplace_division_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_inplace_dunders_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_int_and_float_pow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_int_tensor_pow_neg_ints_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_ldexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cpu_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cpu_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_lowp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_scalar_tensor_promotion_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_lerp_weight_tensor_promotion_error_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_and_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float32_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_or_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int32_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_logical_xor_with_nontrivial_alignment_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_long_tensor_pow_floats_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_and_minimum_subgradient_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex128_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_complex_cuda_complex64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_cross_device_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_float_nan_and_inf_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_forward_ad_float32_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_int_and_bool_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bfloat16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int64_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_maximum_minimum_type_promotion_cuda_uint8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_min_max_binary_op_nan_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_chalf_tensor_and_cpu_scalar_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_mul_intertype_scalar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_muldiv_scalar_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_bfloat16_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_mul_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rpow___cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_complex_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ge_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_heaviside_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_nextafter_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_add_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_floor_rounding_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_hypot_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_ne_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_expand_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___radd___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rand___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rdiv___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmod___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rmul___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___ror___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rpow___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rsub___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index___rxor___cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs__conversions_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_right_shift_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index__refs_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_atan2_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_left_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_right_shift_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_complex_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_copysign_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_floor_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_no_rounding_mode_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_div_trunc_rounding_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmax_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmin_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_hypot_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igamma_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_igammac_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logaddexp_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_u_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_hermite_polynomial_he_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_index_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ldexp_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logaddexp_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_mul_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_nextafter_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_polar_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_rsub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_h_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_hermite_polynomial_he_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_laguerre_polynomial_l_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_legendre_polynomial_p_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_xlog1py_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_special_zeta_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_true_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_non_contig_xlogy_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___radd___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rdiv___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rmod___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rmul___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rpow___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable___rsub___cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs__conversions_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs__conversions_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_floor_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_div_trunc_rounding_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logaddexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_lt_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable__refs_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_atan2_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_complex_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_copysign_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_floor_rounding_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_no_rounding_mode_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_div_trunc_rounding_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmax_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmin_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_hypot_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_igamma_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_igammac_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ldexp_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logaddexp_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_mul_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_nextafter_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_polar_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_rsub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_chebyshev_polynomial_w_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_hermite_polynomial_h_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_hermite_polynomial_he_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_laguerre_polynomial_l_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_legendre_polynomial_p_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_xlog1py_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_special_zeta_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_true_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_not_broadcastable_xlogy_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_out_resize_warning_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex_extremal_passing_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_complex_extremal_passing_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_inplace_resizing_exception_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_base_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_overloads_mem_overlap_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_pow_scalar_type_promotion_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rdiv_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_extremal_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_fmod_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_eq_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_floor_divide_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_or_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_eq_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_bfloat16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_large_values_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_add_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_bitwise_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_logical_xor_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values__refs_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_add_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_bitwise_xor_cuda_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_max_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_clamp_min_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_eq_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_float_power_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_floor_divide_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_fmod_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gcd_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ge_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_gt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_heaviside_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_isclose_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_jiterator_binary_return_by_ref_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lcm_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_bool, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_le_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_and_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_or_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_logical_xor_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_lt_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_max_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_maximum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_min_binary_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_minimum_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_ne_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex128, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_pow_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_remainder_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_small_values_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_reference_numerics_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_fmod_large_dividend_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_fmod_large_dividend_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_remainder_overflow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_rpow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_eq_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lcm_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_ne_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support__refs_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_add_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_bitwise_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_max_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_max_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_min_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_clamp_min_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_eq_cuda_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_float_power_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_floor_divide_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_floor_divide_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_fmod_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_fmod_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gcd_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ge_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ge_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_gt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_heaviside_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_heaviside_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_isclose_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_complex64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_jiterator_binary_return_by_ref_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lcm_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_le_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_le_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_and_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_or_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_logical_xor_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lt_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_lt_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_max_binary_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_max_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_maximum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_maximum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_min_binary_cuda_float32, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_min_binary_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_minimum_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_minimum_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_ne_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_pow_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_remainder_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_remainder_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_scalar_support_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_shift_limits_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_signed_shift_cuda_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_complex128, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_complex64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_cuda_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_sub_typing_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_tensor_pow_tensor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_trapezoid_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_true_divide_out_cuda_bfloat16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_true_divide_out_cuda_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___radd___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rand___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rdiv___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rmod___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rmul___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___ror___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rpow___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rsub___cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion___rxor___cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs__conversions_complex_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs__conversions_polar_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_left_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_right_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_bitwise_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_clamp_max_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_clamp_min_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_copysign_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_floor_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_no_rounding_mode_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_div_trunc_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_eq_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_float_power_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_floor_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmax_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmin_cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_fmod_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_gcd_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_ge_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_gt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_heaviside_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_hypot_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_igamma_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_igammac_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_isclose_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_lcm_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_le_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logaddexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_logical_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_lt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_maximum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_minimum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_ne_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_nextafter_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_pow_cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_remainder_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_rsub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_special_xlog1py_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_special_zeta_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_sub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_true_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion__refs_xlogy_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_add_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_atan2_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_left_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_right_shift_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_bitwise_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_clamp_max_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_clamp_min_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_complex_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_copysign_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_floor_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_no_rounding_mode_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_div_trunc_rounding_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_eq_cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_float_power_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_floor_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmax_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmin_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_fmod_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_gcd_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ge_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_gt_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_heaviside_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_hypot_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_igamma_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_igammac_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_isclose_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_jiterator_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_jiterator_binary_return_by_ref_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_lcm_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ldexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_le_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logaddexp_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_and_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_or_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_logical_xor_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_lt_cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_max_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_maximum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_min_binary_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_minimum_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_mul_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_ne_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_nextafter_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_polar_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_pow_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_remainder_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_rsub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_t_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_u_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_v_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_chebyshev_polynomial_w_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_hermite_polynomial_h_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_hermite_polynomial_he_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_laguerre_polynomial_l_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_legendre_polynomial_p_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_t_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_u_cuda, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_v_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_shifted_chebyshev_polynomial_w_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_xlog1py_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_special_zeta_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_sub_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_true_divide_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_type_promotion_xlogy_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_bfloat16_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int16, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_float64_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_float64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_cuda_uint8_uint8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_gradients_cuda_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_xlogy_xlog1py_scalar_type_promotion_cuda, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_bool_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int64, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_float64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int16_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int32_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_int8, 
test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int64_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_int8_uint8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_bool, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_float32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_float64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int16, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int32, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int64, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_int8, test/test_binary_ufuncs.py::TestBinaryUfuncsCUDA::test_zeta_cuda_uint8_uint8
2025-12-04T11:51:53.3051713Z
2025-12-04T11:51:53.3051878Z Finished test_binary_ufuncs 1/1 ... [2025-12-04 11:51:52.980808][3575621.505616154], took 3.06min
2025-12-04T11:51:53.3052268Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T11:51:53.3052638Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:51:53.3052840Z Running test_ops_jit 1/1 ... [2025-12-04 11:51:52.987460][3575621.512274488]
2025-12-04T11:51:53.3053006Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:51:53.3053366Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:51:52.987644]
2025-12-04T12:05:33.6055867Z
2025-12-04T12:05:33.6056569Z test_ops_jit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_ops_jit_1.1_05467dea1a85db38_.log
2025-12-04T12:05:33.6188218Z Running 1140 items in this shard: test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_abs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_acos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_asinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_atanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_clamp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_floor_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_div_trunc_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erfc_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_erfinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_exp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_expm1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_ge_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_gt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_igamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_igammac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_le_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_lgamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_det_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_householder_product_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_inv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_log_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_logit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_lt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mH_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_matmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_matrix_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_max_binary_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_nn_functional_rms_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_outer_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_3_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_round_decimals_neg_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_sub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_tanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_transpose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_trunc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_jit_alias_remapping_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_H_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_H_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_T_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_T_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___getitem___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___getitem___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___radd___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___radd___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rdiv___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rdiv___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmatmul___cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmod___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmul___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rmul___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rpow___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rsub___cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit___rsub___cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__batch_norm_with_update_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__chunk_cat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__chunk_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__native_batch_norm_legit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__segment_reduce_lengths_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__segment_reduce_offsets_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__softmax_backward_data_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit__upsample_bilinear2d_aa_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_abs_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_abs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acos_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_acosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addbmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcdiv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcmul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addcmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_decomposed_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmm_decomposed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addmv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_addr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_alias_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_alias_copy_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_all_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_all_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_allclose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_allclose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_aminmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_angle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_angle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_any_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_any_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_arange_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argsort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_argwhere_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_partial_views_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_partial_views_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_as_strided_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asinh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_asinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atanh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_1d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_atleast_3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_baddbmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bernoulli_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bfloat16_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bfloat16_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_block_diag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_block_diag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bool_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bool_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_shapes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_broadcast_to_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_bucketize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_byte_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_byte_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cartesian_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cartesian_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cauchy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdouble_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cdouble_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ceil_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cfloat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chalf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chalf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_char_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_char_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_inverse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_inverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cholesky_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chunk_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_chunk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_max_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clamp_min_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_clone_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_column_stack_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_column_stack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_combinations_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_combinations_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_complex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_physical_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_conj_physical_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_constant_pad_nd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_contiguous_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_contiguous_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_copysign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_corrcoef_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cos_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cosh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_count_nonzero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cov_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cross_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cummax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cummin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumprod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumprod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumulative_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_cumulative_trapezoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_deg2rad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diag_embed_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagflat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagflat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diagonal_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diff_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_diff_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_digamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dist_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_floor_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_no_rounding_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_div_trunc_rounding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_double_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_double_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_dstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_einsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_einsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_permuted_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_permuted_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_equal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_equal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erfc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_erfinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_copy_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expand_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expm1_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_expm1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_exponential_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eye_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_eye_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_fftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_hfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ifftshift_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_ihfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfftn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_irfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfft2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fft_rfftn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flatten_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flatten_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flip_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flip_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fliplr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fliplr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flipud_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_flipud_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_power_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_float_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_floor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_floor_divide_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_fmod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_frac_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_frexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_full_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gather_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gather_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ge_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geometric_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geqrf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_geqrf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gradient_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_grid_sampler_2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_grid_sampler_3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_gt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_half_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_half_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hash_tensor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_heaviside_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_histc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_hypot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_igamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_igammac_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_imag_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_put_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_put_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_reduce_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_index_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_inner_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_inner_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_int_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isclose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isclose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isfinite_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isfinite_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isinf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isinf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isnan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isneginf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isposinf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isreal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_isreal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_istft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_item_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_item_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_2inputs_2outputs_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_binary_return_by_ref_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_unary_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_jiterator_unary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kron_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kron_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_kthvalue_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ldexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ldexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_le_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lerp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lerp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lgamma_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cholesky_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cond_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cond_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cross_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_cross_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_det_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_diagonal_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_diagonal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eig_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eig_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvals_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvalsh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_eigvalsh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_householder_product_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_householder_product_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_inv_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_factor_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_ldl_solve_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_ex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_power_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_power_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_matrix_rank_hermitian_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_multi_dot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_multi_dot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_hermitian_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_hermitian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_singular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_pinv_singular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_qr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_qr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_slogdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_ex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_ex_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_triangular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_solve_triangular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svdvals_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_svdvals_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorinv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorinv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_tensorsolve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vander_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vander_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vecdot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vecdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_vector_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_tensor_overload_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linspace_tensor_overload_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log10_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log1p_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log1p_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_normal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_log_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logaddexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logcumsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logcumsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logdet_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logdet_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_and_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_and_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_not_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_not_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_or_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_or_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_xor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logical_xor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_tensor_overload_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logspace_tensor_overload_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_long_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_long_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_unpack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_lu_unpack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mH_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mH_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mT_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mT_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_argmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_argmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumprod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumsum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_cumsum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_fill_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_fill_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_log_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logaddexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logsumexp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_logsumexp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_median_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_softmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_std_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_std_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_sum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_masked_var_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matmul_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matmul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matrix_exp_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_matrix_exp_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_pool2d_with_indices_backward_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_no_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_max_reduction_with_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_maximum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_median_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_list_of_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_list_of_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_variadic_tensors_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_meshgrid_variadic_tensors_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_binary_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_reduction_no_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_min_reduction_with_dim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_minimum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mode_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_movedim_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_msort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mul_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_multinomial_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mv_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nan_to_num_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanmedian_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nanquantile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nansum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nansum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_narrow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_batch_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_dropout_backward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_native_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ne_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_empty_strided_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_full_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_ones_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_ones_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_zeros_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_new_zeros_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nextafter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_adaptive_max_pool3d_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_alpha_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_avg_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_batch_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_celu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_channel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_channel_shuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv1d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose1d_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_conv_transpose3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cosine_similarity_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_cross_entropy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_ctc_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_dropout_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_elu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_embedding_bag_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_embedding_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_feature_alpha_dropout_without_train_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_gelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_glu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_grid_sample_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_group_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardsigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardswish_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hardtanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_huber_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_instance_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_area_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_linear_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_nearest_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_kl_div_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_layer_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_leaky_relu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_linear_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_linear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_local_response_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_logsigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_pool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool1d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool1d_grad_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool2d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool3d_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_mish_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_mse_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multi_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_normalize_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_circular_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_circular_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_constant_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_constant_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_reflect_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pairwise_distance_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pairwise_distance_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pdist_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_shuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_unshuffle_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_prelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_relu6_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_relu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rms_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rms_norm_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_rrelu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_selu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_silu_complex_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_silu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_soft_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softplus_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softsign_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_softsign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_tanhshrink_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_tanhshrink_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_threshold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_unfold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_upsample_bilinear_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nn_functional_upsample_nearest_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_static_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_nonzero_static_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_fro_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_fro_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_inf_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_nuc_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_norm_nuc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_in_place_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_in_place_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_normal_number_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ones_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ormqr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ormqr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_outer_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_outer_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pca_lowrank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pca_lowrank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_permute_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pinverse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pinverse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polar_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_1_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_2_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_polygamma_polygamma_n_4_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_positive_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_positive_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pow_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_pow_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_prod_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_put_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_qr_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_qr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_quantile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rad2deg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rand_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randint_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randint_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_randn_like_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ravel_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_ravel_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_real_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_real_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reciprocal_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reciprocal_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_remainder_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_renorm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_renorm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_interleave_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_repeat_interleave_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_reshape_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize_as__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resize_as__cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_conj_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_conj_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_neg_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_resolve_neg_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_roll_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_roll_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rot90_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rot90_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_round_decimals_neg_3_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsqrt_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsqrt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_rsub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scalar_tensor_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scalar_tensor_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_add_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_add_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_amax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_amin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_prod_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_scatter_reduce_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_searchsorted_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_select_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sgn_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sgn_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_short_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_short_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sigmoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sigmoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sign_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_bartlett_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_blackman_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_cosine_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_exponential_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_gaussian_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_general_cosine_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_general_hamming_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_hamming_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_hann_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_kaiser_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signal_windows_nuttall_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_signbit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sin_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sin_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinc_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sinh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_slice_scatter_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_with_dtype_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_softmax_with_dtype_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sort_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_mm_reduce_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_sampled_addmm_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sparse_sampled_addmm_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_airy_ai_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_j0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_j1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_y0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_bessel_y1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_u_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_v_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_chebyshev_polynomial_w_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_entr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_erfcx_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_hermite_polynomial_h_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_hermite_polynomial_he_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i0e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_i1e_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_laguerre_polynomial_l_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_legendre_polynomial_p_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_log_ndtr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_i0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_i1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_ndtr_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_ndtri_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_spherical_bessel_j0_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_xlog1py_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_special_zeta_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_list_args_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_list_args_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_split_with_sizes_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sqrt_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sqrt_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_square_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_square_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_squeeze_multiple_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_mean_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_std_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_stft_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sub_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sub_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_to_size_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_sum_to_size_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_lowrank_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_svd_lowrank_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_t_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_along_dim_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_take_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tan_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tanh_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tanh_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensor_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensor_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tensordot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tile_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_to_sparse_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_topk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trace_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trace_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_copy_cuda_complex64, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_transpose_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapezoid_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapezoid_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapz_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trapz_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triangular_solve_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triangular_solve_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tril_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_tril_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triu_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_triu_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_true_divide_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_true_divide_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_trunc_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unbind_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unflatten_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unflatten_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unfold_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_uniform_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_uniform_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unique_consecutive_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unique_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_chunk_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_chunk_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsafe_split_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_unsqueeze_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_mean_unbiased_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_unbiased_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_var_unbiased_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vdot_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_complex_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_as_real_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_copy_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_copy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_view_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vsplit_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vsplit_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_vstack_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_where_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_where_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_xlogy_cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zero__cuda_float32, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_cuda_float32, 
test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_complex64, test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_zeros_like_cuda_float32
2025-12-04T12:05:33.6315828Z
2025-12-04T12:05:33.6315933Z Finished test_ops_jit 1/1 ... [2025-12-04 12:05:33.606070][3576442.130879864], took 13.68min
2025-12-04T12:05:33.6316313Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T12:05:33.6316668Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T12:05:33.6316882Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T12:05:33.6317131Z Uploading artifacts took 0.00 seconds
2025-12-04T12:05:33.6317313Z Running test_nestedtensor 2/2 ... [2025-12-04 12:05:33.613494][3576442.138307422]
2025-12-04T12:05:33.6317492Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T12:05:33.6317873Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nestedtensor.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ...
[2025-12-04 12:05:33.613710] 2025-12-04T12:27:32.0477996Z 2025-12-04T12:27:32.0479149Z test_nestedtensor 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_nestedtensor_2.2_1566fd52202f7957_.log 2025-12-04T12:27:32.0620056Z Running 836 items in this shard: test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_2_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_2d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_batch_size_4_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_2_max_seq_len_5_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_10, test/test_nestedtensor.py::TestNestedTensor::test_3d_nested_tensor_float_batch_size_4_max_seq_len_3_vocab_size_20, test/test_nestedtensor.py::TestNestedTensor::test_default_nested_tensor, test/test_nestedtensor.py::TestNestedTensor::test_dim, test/test_nestedtensor.py::TestNestedTensor::test_fill_, test/test_nestedtensor.py::TestNestedTensor::test_like_functions_ones_like, 
test/test_nestedtensor.py::TestNestedTensor::test_nested_namespace, test/test_nestedtensor.py::TestNestedTensor::test_nested_tensor_matching_dim, test/test_nestedtensor.py::TestNestedTensor::test_nested_view_from_buffer_overflow_errors, test/test_nestedtensor.py::TestNestedTensor::test_numel, test/test_nestedtensor.py::TestNestedTensor::test_repr_string, test/test_nestedtensor.py::TestNestedTensor::test_size, test/test_nestedtensor.py::TestNestedTensor::test_stride, test/test_nestedtensor.py::TestNestedTensor::test_to, test/test_nestedtensor.py::TestNestedTensor::test_to_padded_tensor_on_empty_tensor, test/test_nestedtensor.py::TestNestedTensor::test_unbind_0, test/test_nestedtensor.py::TestNestedTensor::test_unbind_1, test/test_nestedtensor.py::TestNestedTensor::test_unbind_dim, test/test_nestedtensor.py::TestNestedTensor::test_zero_, test/test_nestedtensor.py::TestNestedInt::test_with_factor, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cpu_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_cuda_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_bmm_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_contiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_detach_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_device_checks_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_noncontiguous_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_dropout_strided_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_embedding_strided_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_empty_like_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amax_dtypes_cuda_int16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_amin_dtypes_cuda_int8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_int32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_int64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmax_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_int16, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_int32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_argmin_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_int64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_max_dtypes_cuda_uint8, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_int16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_jagged_min_dtypes_cuda_int64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_layer_norm_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_linear_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_masked_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_matmul_nt_with_broadcasted_t_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_narrow_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_add_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_128_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_dense_elementwise_embedding_dim_256_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_div_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_indexing_noncontiguous_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_mul_in_place_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_split_with_sizes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sub_transpose_True_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_nested_tensor_sum_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_reshape_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_scaled_dot_product_attention_input_dim_3_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_False_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_serialization_requires_grad_True_weights_only_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_softmax_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_squeeze_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim3_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_dim4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_noncontiguous_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_output_size_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_simple_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_padded_tensor_zero_numel_errors_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_to_then_from_padded_tensor_no_transform0213_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_cuda_float64, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_transpose_inference_mode_interaction_cuda_float16, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_abs__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_abs_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_gelu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isinf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isnan_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_isneginf_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_relu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_silu__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_silu_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_sin_cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unary_funcs_tanh__cuda, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_unbind_noncontiguous_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float32, test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorDeviceTypeCUDA::test_view_inference_mode_interaction_cuda_float64, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_abs_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_accumulate_grad_different_strides_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_as_nested_tensor_propagates_gradients_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_backward_add_strided_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_gelu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_5d_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_1024_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_128_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_256_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_32_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_4_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_512_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_layer_norm_backward_size_513_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_masked_fill_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_bmm_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_list_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_from_padded_fused_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_linear_plus_transpose_cuda, 
test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_matmul_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_reshape_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_softmax_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_transpose_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_nested_tensor_unsqueeze_gradcheck_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_relu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_selu_backward_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_split_with_sizes_flow_through_cuda, test/test_nestedtensor.py::TestNestedTensorAutogradCUDA::test_values_grad_with_broadcast_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_apply__cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_0_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_1_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_2_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_3_layout_strided_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_jagged_requires_grad_True_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_False_contiguous_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_as_nested_tensor_from_tensor_dim_4_layout_strided_requires_grad_True_contiguous_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_binary_pointwise_transposed_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_broadcast_shapes_on_in_graph_constructed_njt_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_preserves_metadata_cache_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_max_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_compile_with_dynamic_min_seq_len_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_composite_op_with_custom_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_device_dtype_transfer_updates_offsets_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dropout_inference_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_use_legacy_api_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_dummy_mha_with_nt_use_legacy_api_True_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_is_same_size_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_as_nested_tensor_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_False_components_require_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_nested_tensor_requires_grad_True_components_require_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_layout_construction_with_pinned_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_op_different_output_shape_dim_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_padded_dense_conversion_kernels_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_False_values_is_view_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_jagged_view_from_values_offsets_requires_grad_True_values_is_view_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_operate_on_batch_dim_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layer_norm_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_layout_under_torch_dispatch_mode_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_empty_like_cuda, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_shape_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_empty_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_ones_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_rand_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_randint_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_randn_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_like_value_zeros_like_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_linear_backward_memory_usage_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_activation_checkpoint_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_fx_trace_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_nested_tensor_from_jagged_pass_min_max_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_njt_cat_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_pointwise_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_transposed_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_noncontiguous_to_noncontig_with_holes_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_batch_only_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_1_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_mean_transpose_offset_2_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_1_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_reduce_ragged_idx_greater_than_1_different_output_shape_sum_transpose_offset_2_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_transpose_non_ragged_dim_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_False_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_mean_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_op_dim_with_lengths_different_output_shape_sum_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_permute_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_pin_memory_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_reshape_decomp_requires_grad_False_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_backwards_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_compile_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_bfloat16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_flop_counter_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_constant_sequence_length_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sdpa_with_packed_in_proj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_transposed_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_serialization_noncontig_with_holes_weights_only_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_1_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_reduce_ragged_idx_greater_than_1_same_output_shape_transpose_offset_2_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_False_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_requires_grad_True_components_require_grad_True_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_transpose_non_ragged_dim_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_False_components_require_grad_True_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_dim_with_lengths_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_False_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_False_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_softmax_reduce_batch_dim_requires_grad_True_components_require_grad_True_log_softmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_split_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_batch_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_False_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_False_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_False_components_require_grad_False_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_sum_dim_reduce_ragged_and_non_batch_keepdim_True_requires_grad_True_components_require_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_tensor_attributes_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_threshold_backward_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_copy_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_dtype_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_2_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_3_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_compile_nt_dim_4_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float16, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_2_requires_grad_True_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_False_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_3_requires_grad_True_cuda_float64, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_False_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_bool, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float16, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_to_padded_tensor_nt_dim_4_requires_grad_True_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unary_pointwise_transposed_inputs_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_backward_cuda_float32, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_0_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_3_cuda, 
test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_lengths_ragged_idx_equals_2_bad_dim_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unbind_transpose_ragged_idx_2_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_unsafe_view_cuda, test/test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_views_inherit_ragged_dim_cuda, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_copysign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_cos_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_float_power_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nansum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_mish_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_0_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_rsqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i0e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_backward_xlogy_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward___rmod___cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_clamp_max_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_deg2rad_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_div_floor_rounding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_fmin_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_linalg_vector_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_min_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_narrow_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardsigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_hardtanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_pow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rad2deg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsqrt_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_rsub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sgn_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sigmoid_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_entr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_trunc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_unsqueeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_backward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rmul___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rpow___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward___rsub___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_abs_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acos_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_add_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_argmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_atanh_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bfloat16_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bmm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cdouble_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ceil_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chalf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_chunk_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_clone_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_complex_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_count_nonzero_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_div_no_rounding_mode_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_float_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmin_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hash_tensor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_hypot_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_igamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isclose_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_log_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logical_and_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_logit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_long_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_lt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_std_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_masked_var_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_max_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_maximum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_min_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_minimum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_ne_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_neg_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_embedding_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_prelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_relu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_softsign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_polygamma_polygamma_n_4_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_pow_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_select_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sign_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_erfcx_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_i1e_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_laguerre_polynomial_l_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k0_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_scaled_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_special_spherical_bessel_j0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_split_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sqrt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_square_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_squeeze_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_std_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sub_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_to_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_var_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_compile_forward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___radd___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rdiv___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward___rmod___cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_acosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_all_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_amin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_angle_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_any_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_asinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atan2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_atanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_bool_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_byte_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cfloat_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_char_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_clamp_min_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_conj_physical_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_cosh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_deg2rad_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_digamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_double_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_eq_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_erfinv_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_exp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_expm1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fill_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_floor_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmax_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmin_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_fmod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_frexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ge_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_gt_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_half_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_heaviside_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_i0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_int_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isfinite_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isinf_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isnan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isneginf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_isposinf_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_binary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_jiterator_unary_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ldexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_le_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_lgamma_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log10_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log1p_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_log_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logaddexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_logical_or_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_logsumexp_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_mean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_masked_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_max_reduction_with_dim_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mean_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mul_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nan_to_num_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nanmean_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_narrow_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_ne_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_celu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_elu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_hardshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_linear_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_relu6_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rms_norm_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_rrelu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_selu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_silu_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softplus_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_softshrink_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_nn_functional_threshold_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polar_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_0_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_polygamma_polygamma_n_2_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_positive_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_prod_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_real_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_reciprocal_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_remainder_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_round_decimals_neg_3_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_short_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_signbit_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinc_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sinh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_airy_ai_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_j1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_bessel_y0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_h_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_hermite_polynomial_he_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_i1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k0_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtr_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_ndtri_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_scaled_modified_bessel_k1_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_xlog1py_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_special_zeta_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_split_with_sizes_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_sum_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tan_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_tanh_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_true_divide_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_trunc_cuda_float32, 
test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_var_unbiased_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_forward_where_cuda_float32, test/test_nestedtensor.py::TestNestedTensorOpInfoCUDA::test_nested_tensor_non_contiguous_mutation_cuda
2025-12-04T12:27:32.0744123Z
2025-12-04T12:27:32.0744237Z Finished test_nestedtensor 2/2 ... [2025-12-04 12:27:32.048433][3577760.57324308], took 21.97min
2025-12-04T12:27:32.0744660Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T12:27:32.0745012Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T12:27:32.0745227Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T12:27:32.0745406Z Uploading artifacts took 0.00 seconds
2025-12-04T12:27:32.0745564Z Running test_modules 1/2 ... [2025-12-04 12:27:32.055963][3577760.580776853]
2025-12-04T12:27:32.0745727Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T12:27:32.0746121Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_modules.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:27:32.056182]
2025-12-04T12:34:06.3382875Z
2025-12-04T12:34:06.3383748Z test_modules 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_modules_1.2_e03f9399cbdd57ef_.log
2025-12-04T12:34:06.3600914Z Running 1801 items in this shard: test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_check_inplace_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_complex32, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GroupNorm_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReLU_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoderLayer_train_mode_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CELU_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Embedding_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_eval_mode_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LogSigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool1d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_eval_mode_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_device_ctx_init_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_eval_mode_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_errors_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_errors_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_GroupNorm_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RMSNorm_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softshrink_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_factory_kwargs_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveAvgPool3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_CircularPad3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_forward_nn_GroupNorm_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LazyConvTranspose3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_forward_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU6_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReLU_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_forward_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_forward_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool1d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm1d_eval_mode_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_MultiheadAttention_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad1d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_grad_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Bilinear_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_L1Loss_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReflectionPad3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_gradgrad_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_BatchNorm3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CircularPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ELU_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_HuberLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_InstanceNorm3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_KLDivLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_KLDivLoss_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LSTM_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSigmoid_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_MultiheadAttention_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReLU_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerDecoderLayer_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_TransformerEncoder_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_if_train_and_eval_modes_differ_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_BatchNorm2d_train_mode_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CircularPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Conv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GLU_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LPPool3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MarginRankingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelSoftMarginLoss_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Sigmoid_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Tanhshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_memory_format_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AdaptiveMaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_CrossEntropyLoss_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardswish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Hardtanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_eval_mode_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm2d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Linear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNNCell_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softshrink_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Tanhshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerDecoderLayer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_multiple_device_transfer_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveAvgPool3d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_AvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Bilinear_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CTCLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConstantPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CrossEntropyLoss_cuda_float16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_CrossEntropyLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_FractionalMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GRUCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GaussianNLLLoss_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTMCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LayerNorm_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LazyConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LocalResponseNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LocalResponseNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Mish_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_MultiheadAttention_train_mode_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_NLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_PoissonNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNNCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Sigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SmoothL1Loss_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softplus_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Threshold_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Threshold_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoderLayer_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad2d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_non_contiguous_tensors_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AdaptiveMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCEWithLogitsLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BCEWithLogitsLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CTCLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_FractionalMaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRUCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GaussianNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_bfloat16, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_repr_nn_GroupNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardswish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Hardswish_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_repr_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_HingeEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_HuberLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LPPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LSTM_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Linear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSigmoid_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSoftmax_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MSELoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Mish_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiLabelSoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_MultiheadAttention_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_PoissonNLLLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad1d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReflectionPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ReplicationPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_SiLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SmoothL1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softplus_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softshrink_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_Softsign_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_repr_nn_TransformerEncoder_train_mode_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_repr_nn_Transformer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_repr_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveAvgPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AdaptiveMaxPool3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_AvgPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BCELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_BatchNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Bilinear_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad2d_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_save_load_nn_CircularPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConstantPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Conv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_complex64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_complex32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_complex128, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ConvTranspose3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CosineEmbeddingLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_CrossEntropyLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Embedding_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Embedding_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_FractionalMaxPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRUCell_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GRU_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GaussianNLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float16, test/test_modules.py::TestModuleCUDA::test_save_load_nn_GroupNorm_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardshrink_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Hardtanh_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_HingeEmbeddingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm1d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm2d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_InstanceNorm3d_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_KLDivLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_L1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_L1Loss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LPPool3d_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTMCell_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LSTM_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LayerNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConv3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose2d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LazyConvTranspose3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LeakyReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LeakyReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSigmoid_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_LogSoftmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MSELoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MarginRankingLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MaxPool3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiLabelSoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_MultiheadAttention_train_mode_cuda_float64, 
test/test_modules.py::TestModuleCUDA::test_save_load_nn_NLLLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PReLU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_PReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RMSNorm_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNNCell_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_RNN_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU6_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReflectionPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad1d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad1d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ReplicationPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SELU_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SELU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SiLU_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SmoothL1Loss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SoftMarginLoss_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_SoftMarginLoss_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmax_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmin_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softmin_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softshrink_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_save_load_nn_Softsign_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Tanh_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerDecoderLayer_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoderLayer_train_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_eval_mode_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_eval_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_TransformerEncoder_train_mode_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_Transformer_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad2d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad3d_cuda_float32, test/test_modules.py::TestModuleCUDA::test_save_load_nn_ZeroPad3d_cuda_float64, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveAvgPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AdaptiveMaxPool3d_swap_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_AvgPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm1d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm2d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm3d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_BatchNorm3d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Bilinear_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CTCLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CircularPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConstantPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv2d_swap_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Conv2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ConvTranspose3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CosineEmbeddingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_CrossEntropyLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_FractionalMaxPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GELU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GELU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRUCell_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GRU_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_GroupNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardshrink_swap_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardswish_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Hardtanh_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_HingeEmbeddingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm1d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm2d_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_InstanceNorm3d_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_KLDivLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LPPool3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTMCell_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LSTM_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LeakyReLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LeakyReLU_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Linear_swap_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LocalResponseNorm_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_LogSoftmax_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MSELoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MSELoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MarginRankingLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MaxPool2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Mish_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiLabelMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiMarginLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_MultiheadAttention_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_NLLLoss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_NLLLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RMSNorm_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_RNN_train_mode_swap_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReflectionPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad3d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ReplicationPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SiLU_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Sigmoid_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SmoothL1Loss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_SoftMarginLoss_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax2d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softmax_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softplus_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softshrink_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Softsign_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanh_swap_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Tanhshrink_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_Threshold_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoderLayer_eval_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_eval_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_TransformerEncoder_train_mode_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad1d_swap_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad1d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad2d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_empty_nn_ZeroPad3d_swap_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveAvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AdaptiveMaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_AvgPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCELoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BCEWithLogitsLoss_swap_True_set_grad_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm1d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm2d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_BatchNorm3d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Bilinear_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_False_set_grad_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_CTCLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CircularPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConstantPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Conv3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose2d_swap_False_set_grad_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ConvTranspose3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CosineEmbeddingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CrossEntropyLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_CrossEntropyLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Embedding_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_FractionalMaxPool3d_swap_True_set_grad_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GELU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRUCell_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GRU_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GaussianNLLLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_GroupNorm_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_False_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardshrink_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardswish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Hardtanh_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HingeEmbeddingLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_HuberLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm1d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm2d_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_InstanceNorm3d_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_KLDivLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_L1Loss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LPPool3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTMCell_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_eval_mode_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LSTM_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LayerNorm_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LeakyReLU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Linear_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LocalResponseNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSigmoid_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_LogSoftmax_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MSELoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MarginRankingLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MaxPool3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_Mish_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiLabelSoftMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_MultiheadAttention_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_NLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_False_set_grad_True_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_PReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_PoissonNLLLoss_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RMSNorm_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNNCell_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_RNN_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU6_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReLU_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad1d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad2d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReflectionPad3d_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad1d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ReplicationPad3d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SELU_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Sigmoid_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SmoothL1Loss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_SoftMarginLoss_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmax_swap_True_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softmin_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softplus_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Softsign_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanh_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Tanhshrink_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Threshold_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerDecoderLayer_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_eval_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoderLayer_train_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_False_set_grad_False_cuda_float32, 
test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_eval_mode_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_TransformerEncoder_train_mode_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_False_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_Transformer_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_False_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad1d_swap_True_set_grad_True_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad2d_swap_True_set_grad_False_cuda_float32, test/test_modules.py::TestModuleCUDA::test_to_nn_ZeroPad3d_swap_False_set_grad_True_cuda_float32 2025-12-04T12:34:06.3805439Z 2025-12-04T12:34:06.3805542Z Finished test_modules 1/2 ... [2025-12-04 12:34:06.339254][3578154.86405229], took 6.57min 2025-12-04T12:34:06.3805915Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T12:34:06.3806344Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:34:06.3806578Z Running test_cpp_extensions_mtia_backend 1/1 ... [2025-12-04 12:34:06.346248][3578154.871062147] 2025-12-04T12:34:06.3806776Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:34:06.3807174Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_cpp_extensions_mtia_backend.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:34:06.346457] 2025-12-04T12:34:08.2136416Z 2025-12-04T12:34:08.2137679Z test_cpp_extensions_mtia_backend 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_cpp_extensions_mtia_backend_1.1_9a81146bc53cf9c7_.log 2025-12-04T12:34:08.2140593Z Running 5 items in this shard: test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_device_context, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_get_device_module, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_basic, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_context, test/test_cpp_extensions_mtia_backend.py::TestCppExtensionMTIABackend::test_stream_context_different_device 2025-12-04T12:34:08.2142746Z 2025-12-04T12:34:08.2143024Z Finished test_cpp_extensions_mtia_backend 1/1 ... [2025-12-04 12:34:08.213374][3578156.738183516], took 0.03min 2025-12-04T12:34:08.2154641Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T12:34:08.2207293Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:34:08.2209054Z Running lazy/test_ts_opinfo 1/1 ... [2025-12-04 12:34:08.220810][3578156.745624555] 2025-12-04T12:34:08.2209365Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:34:08.2211084Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'lazy/test_ts_opinfo.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:34:08.220997] 2025-12-04T12:34:11.4904149Z 2025-12-04T12:34:11.4905026Z lazy/test_ts_opinfo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/lazy.test_ts_opinfo_1.1_724f80422f6aa430_.log 2025-12-04T12:34:11.4906243Z Running 5 items in this shard: test/lazy/test_ts_opinfo.py::TestLazyTensor::testConvolutionBackward, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_tensor_ctr, test/lazy/test_ts_opinfo.py::TestLazyTensor::test_view_mark_step_preserved, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_adaptiveavgpool3d_dynamic, test/lazy/test_ts_opinfo.py::TestLazyDynamicOps::test_nonzero_dynamic 2025-12-04T12:34:11.4907089Z 2025-12-04T12:34:11.4907265Z Finished lazy/test_ts_opinfo 1/1 ... [2025-12-04 12:34:11.490097][3578160.014905455], took 0.05min 2025-12-04T12:34:11.4917119Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T12:34:11.4970075Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:34:11.4971766Z Running test_dynamic_shapes 1/1 ... [2025-12-04 12:34:11.497069][3578160.021883002] 2025-12-04T12:34:11.4972166Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:34:11.4973711Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_dynamic_shapes.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:34:11.497264] 2025-12-04T12:34:39.6553202Z 2025-12-04T12:34:39.6554093Z test_dynamic_shapes 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_dynamic_shapes_1.1_3fe84a99522f30aa_.log 2025-12-04T12:34:39.6605349Z Running 378 items in this shard: test/test_dynamic_shapes.py::TestPySymInt::test_arith_ops, test/test_dynamic_shapes.py::TestPySymInt::test_aten_ops, test/test_dynamic_shapes.py::TestPySymInt::test_avoid_unbacked_substitution, test/test_dynamic_shapes.py::TestPySymInt::test_backed_size_oblivious_01_spec, test/test_dynamic_shapes.py::TestPySymInt::test_baddbmm_symint, test/test_dynamic_shapes.py::TestPySymInt::test_binary, test/test_dynamic_shapes.py::TestPySymInt::test_data_dependent_guard, test/test_dynamic_shapes.py::TestPySymInt::test_data_dependent_guard_propagate_real_tensors, test/test_dynamic_shapes.py::TestPySymInt::test_debug_has_internal_overlap_unbacked, test/test_dynamic_shapes.py::TestPySymInt::test_deepcopy, test/test_dynamic_shapes.py::TestPySymInt::test_duck_shape, test/test_dynamic_shapes.py::TestPySymInt::test_ephemeral_source_simplification, test/test_dynamic_shapes.py::TestPySymInt::test_ephemeral_source_unified_with_non_ephemeral_source, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_basic, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_double_digits, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_prefer_later, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_refine_range, test/test_dynamic_shapes.py::TestPySymInt::test_expect_true_with_s0, test/test_dynamic_shapes.py::TestPySymInt::test_floor_clean_div_axioms, test/test_dynamic_shapes.py::TestPySymInt::test_floordiv_static, test/test_dynamic_shapes.py::TestPySymInt::test_fx_trace_intlist, test/test_dynamic_shapes.py::TestPySymInt::test_guard_int, test/test_dynamic_shapes.py::TestPySymInt::test_guard_refine_range, test/test_dynamic_shapes.py::TestPySymInt::test_hash_size, 
test/test_dynamic_shapes.py::TestPySymInt::test_int_bool, test/test_dynamic_shapes.py::TestPySymInt::test_int_conversion, test/test_dynamic_shapes.py::TestPySymInt::test_int_to_float, test/test_dynamic_shapes.py::TestPySymInt::test_max_of_unique_summation_opt, test/test_dynamic_shapes.py::TestPySymInt::test_meta_symint, test/test_dynamic_shapes.py::TestPySymInt::test_mul_int_oo_nan, test/test_dynamic_shapes.py::TestPySymInt::test_non_overlapping_and_dense_backed, test/test_dynamic_shapes.py::TestPySymInt::test_non_overlapping_and_dense_unbacked, test/test_dynamic_shapes.py::TestPySymInt::test_numel, test/test_dynamic_shapes.py::TestPySymInt::test_numpy_sym_max, test/test_dynamic_shapes.py::TestPySymInt::test_numpy_sym_min, test/test_dynamic_shapes.py::TestPySymInt::test_prefer_deferred_runtime_assertions_over_guards, test/test_dynamic_shapes.py::TestPySymInt::test_prims_non_overlapping_and_dense, test/test_dynamic_shapes.py::TestPySymInt::test_print_readable_with_symints, test/test_dynamic_shapes.py::TestPySymInt::test_reverse_arith_ops, test/test_dynamic_shapes.py::TestPySymInt::test_roundtrip, test/test_dynamic_shapes.py::TestPySymInt::test_size_expressions, test/test_dynamic_shapes.py::TestPySymInt::test_slice_backed_size_oblivious, test/test_dynamic_shapes.py::TestPySymInt::test_specialize_zero_one, test/test_dynamic_shapes.py::TestPySymInt::test_statically_known_false, test/test_dynamic_shapes.py::TestPySymInt::test_statically_known_true, test/test_dynamic_shapes.py::TestPySymInt::test_stride, test/test_dynamic_shapes.py::TestPySymInt::test_sym_ceil, test/test_dynamic_shapes.py::TestPySymInt::test_sym_floor, test/test_dynamic_shapes.py::TestPySymInt::test_sym_int, test/test_dynamic_shapes.py::TestPySymInt::test_sym_ite, test/test_dynamic_shapes.py::TestPySymInt::test_sym_log2, test/test_dynamic_shapes.py::TestPySymInt::test_sym_max_multi_max_simplify, test/test_dynamic_shapes.py::TestPySymInt::test_sym_sqrt, 
test/test_dynamic_shapes.py::TestPySymInt::test_sym_sum, test/test_dynamic_shapes.py::TestPySymInt::test_sym_trunc, test/test_dynamic_shapes.py::TestPySymInt::test_symint_args, test/test_dynamic_shapes.py::TestPySymInt::test_symint_as_scalar, test/test_dynamic_shapes.py::TestPySymInt::test_symint_bitwise_and, test/test_dynamic_shapes.py::TestPySymInt::test_symint_bitwise_or, test/test_dynamic_shapes.py::TestPySymInt::test_symint_bitwise_xor, test/test_dynamic_shapes.py::TestPySymInt::test_symint_vargs, test/test_dynamic_shapes.py::TestPySymInt::test_sympify_symint, test/test_dynamic_shapes.py::TestPySymInt::test_sympy_optimized_add, test/test_dynamic_shapes.py::TestPySymInt::test_sympy_optimized_add_binary_search, test/test_dynamic_shapes.py::TestPySymInt::test_tensor_factory_with_symint, test/test_dynamic_shapes.py::TestPySymInt::test_tracing_sym_ite, test/test_dynamic_shapes.py::TestPySymInt::test_unbacked_substitution, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_abs, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_add, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_and, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_bitwise_and, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_bitwise_or, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_bitwise_xor, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ceil, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_eq, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_float_pow, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_float_truediv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_floor, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ge, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_gt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_int_floordiv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_int_truediv, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_is_integer, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_le, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_lshift, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_lt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_mod, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_mul, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_ne, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_neg, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_or, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_pos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_pow_by_natural, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_round, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_rshift, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sub, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_acos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_asin, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_atan, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_cos, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_cosh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_ite, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_log2, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_max, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_min, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_not, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sin, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sinh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_sqrt, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_tan, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_sym_tanh, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_bool_method_fn_trunc, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_dynamic_int_basic_compile_backend_eager, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_dynamic_int_basic_compile_backend_inductor, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_dynamic_int_eager_usage, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_abs_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_int_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_add_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_and_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_and_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_or_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_xor_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_xor_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_xor_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_bitwise_xor_first_type_int_second_type_int, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ceil_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_eq_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_pow_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_float_truediv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_float_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_floor_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ge_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_gt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_floordiv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_float_second_type_int, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_int_truediv_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_is_integer_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_le_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lshift_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_int_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_lt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mod_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_mul_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_ne_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_neg_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_float_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_or_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_pow_by_natural_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_round_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_float_second_type_int, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_rshift_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sub_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_acos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_asin_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_int_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_atan_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cos_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_cosh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_float_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_ite_first_type_int_second_type_int, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_log2_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_max_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_min_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_not_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_float_second_type_float, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sin_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sinh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_sqrt_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tan_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_float_second_type_int, 
test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_sym_tanh_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_float_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_float_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_int_second_type_float, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_method_fn_trunc_first_type_int_second_type_int, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_non_symbolic_symnode, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_stride_symnode, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symint_deepcopy, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symint_hashing, test/test_dynamic_shapes.py::TestSymNumberMagicMethods::test_symnode_hashing, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_assumptions, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_div_by_one, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_div_does_not_generate_non_int_rational, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_float_int, test/test_dynamic_shapes.py::TestFloorDiv::test_floordiv_simplify, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_congruences_simple, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_inequalities_error, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_reduce_inequalities_simple, test/test_dynamic_shapes.py::TestDimConstraints::test_dim_constraints_solve_full, test/test_dynamic_shapes.py::TestDimConstraints::test_simplify_max_1_0, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guard_or_false, 
test/test_dynamic_shapes.py::TestGuardsExpressions::test_guard_or_true, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_float_div, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_float_print, test/test_dynamic_shapes.py::TestGuardsExpressions::test_guards_gt_lt, test/test_dynamic_shapes.py::TestGuardsExpressions::test_remove_symbols_without_guarding, test/test_dynamic_shapes.py::TestGuardsExpressions::test_size_comparison_no_recompile, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_neq_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_neq_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_eq_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_eq_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_or_assert_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_sym_or_assert_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_with_unbacked_input_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_deferred_with_unbacked_input_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_div_unbacked_eq_globals, test/test_dynamic_shapes.py::TestUnbacked::test_div_unbacked_eq_input_ints, test/test_dynamic_shapes.py::TestUnbacked::test_div_unbacked_eq_input_tensors, test/test_dynamic_shapes.py::TestUnbacked::test_div_unbacked_eq_item, test/test_dynamic_shapes.py::TestUnbacked::test_do_not_guard_unbacked_inputs, test/test_dynamic_shapes.py::TestUnbacked::test_has_free_symbols, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert1_backend_eager, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert1_backend_inductor, test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert2_backend_eager, 
test/test_dynamic_shapes.py::TestUnbacked::test_post_specialize_runtime_assert2_backend_inductor, test/test_dynamic_shapes.py::TestUbackedOps::test_backed_size_oblivious_broadcast, test/test_dynamic_shapes.py::TestUbackedOps::test_backed_size_oblivious_expand, test/test_dynamic_shapes.py::TestUbackedOps::test_invalid_view_unbacked_view, test/test_dynamic_shapes.py::TestUbackedOps::test_narrow_unbacked_start, test/test_dynamic_shapes.py::TestUbackedOps::test_narrow_unbacked_start_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_narrow_with_tensor_start, test/test_dynamic_shapes.py::TestUbackedOps::test_nonzero_select, test/test_dynamic_shapes.py::TestUbackedOps::test_nonzero_select_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_nonzero_slice, test/test_dynamic_shapes.py::TestUbackedOps::test_nonzero_slice_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_padnd, test/test_dynamic_shapes.py::TestUbackedOps::test_select_scatter_unbacked_index, test/test_dynamic_shapes.py::TestUbackedOps::test_slice_with_tensor_indices, test/test_dynamic_shapes.py::TestUbackedOps::test_slice_with_tensor_indices_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_tensor_split, test/test_dynamic_shapes.py::TestUbackedOps::test_tensor_split_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_trunc_int_div_true, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_contiguous, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_item, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_item_set_item, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_item_set_item2, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_item_set_item3, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_non_contigious_reshape_failing, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape1, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape2, 
test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape3, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_reshape_copy, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select2, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_2, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_select_index_with_check, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_slice, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_slice_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_slice_with_step, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_slice_with_step_cpp_wrapper, test/test_dynamic_shapes.py::TestUbackedOps::test_unbacked_view_extra, test/test_dynamic_shapes.py::TestUbackedOps::test_unbind_not_dynamic 2025-12-04T12:34:39.6657955Z 2025-12-04T12:34:39.6658069Z Finished test_dynamic_shapes 1/1 ... [2025-12-04 12:34:39.655389][3578188.180198293], took 0.47min 2025-12-04T12:34:39.6658452Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T12:34:39.6658811Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:34:39.6659081Z Running test_schema_check 1/1 ... [2025-12-04 12:34:39.663078][3578188.187892058] 2025-12-04T12:34:39.6659249Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:34:39.6659620Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_schema_check.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:34:39.663307] 2025-12-04T12:44:16.7103043Z 2025-12-04T12:44:16.7104005Z test_schema_check 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_schema_check_1.1_d4fa2afdf432c9c4_.log 2025-12-04T12:44:16.8003144Z Running 5996 items in this shard: test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_output_is_input, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_secretly_aliasing, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_custom_ops_secretly_mutating, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_multiple_operators, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_multiple_operators_centered, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_outputs_unexpectedly_aliasing, test/test_schema_check.py::TestSchemaCheck::test_alias_check_fail_simple, test/test_schema_check.py::TestSchemaCheck::test_is_alias_of_basic, test/test_schema_check.py::TestSchemaCheck::test_is_alias_of_empty_container, test/test_schema_check.py::TestSchemaCheck::test_mutation_check_fail, test/test_schema_check.py::TestSchemaCheck::test_mutation_check_fail_multiple_operators, test/test_schema_check.py::TestSchemaCheck::test_overlaps_basic, test/test_schema_check.py::TestSchemaCheck::test_overlaps_empty_container, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_empty_list_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_aliasing_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_default_replaced, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_device_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_kwarg_tensor, 
test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_list_input, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_mutable_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_nested_training_op, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_training_op, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_wildcard_after, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_with_multiple_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_functionality_with_multiple_outputs_aliasing, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_aliasing_inputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_aliasing_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_as_strided, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_multiple_outputs, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_mutation, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_none, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_mutated_aliasing_resize_, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_operator_order, test/test_schema_check.py::TestSchemaCheck::test_schema_check_mode_operator_order_without_grad, test/test_schema_check.py::TestSchemaCheck::test_schema_info_bind_basic, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_H_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_T_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___getitem___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___radd___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rand___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rdiv___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmatmul___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmod___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rmul___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___ror___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rpow___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rsub___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness___rxor___cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__batch_norm_with_update_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__chunk_cat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__native_batch_norm_legit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_lengths_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__segment_reduce_offsets_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__softmax_backward_data_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__unsafe_masked_index_put_accumulate_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness__upsample_bilinear2d_aa_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_abs_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acos_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_acosh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addbmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcdiv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addcmul_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmm_decomposed_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addmv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_addr_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_alias_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_all_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_allclose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_aminmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_angle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_any_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_arange_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argsort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_argwhere_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_partial_views_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_as_strided_scatter_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_asinh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atan_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atanh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_1d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_2d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_atleast_3d_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_baddbmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bernoulli_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bfloat16_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bincount_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_and_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_left_shift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_not_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_or_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_right_shift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bitwise_xor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_block_diag_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bool_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_shapes_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_broadcast_to_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_bucketize_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_byte_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cartesian_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cauchy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cdouble_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ceil_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cfloat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chalf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_char_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_inverse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cholesky_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_chunk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_max_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clamp_min_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_clone_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_column_stack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_combinations_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_complex_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_conj_physical_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_constant_pad_nd_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_contiguous_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_copysign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_corrcoef_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cos_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cosh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_count_nonzero_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cov_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cross_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cummin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumprod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumsum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_cumulative_trapezoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_deg2rad_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diag_embed_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagflat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diagonal_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_diff_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_digamma_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_floor_rounding_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_no_rounding_mode_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_div_trunc_rounding_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_double_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_dstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_einsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_like_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_permuted_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_empty_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eq_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_equal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_erfinv_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expand_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_expm1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_exponential_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e4m3fn, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e4m3fnuz, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e5m2, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_float8_e5m2fnuz, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_eye_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_fftshift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_hfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ifftshift_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_ihfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_irfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfft_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft_rfftn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fill_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flatten_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flip_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fliplr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_flipud_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_float_power_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_floor_divide_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fmod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frac_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_frexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_full_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gather_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gcd_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ge_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geometric_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_geqrf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gradient_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_grid_sampler_3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_gt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_half_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hash_tensor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_heaviside_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_histc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_hypot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_i0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igamma_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igammac_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_igammac_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_imag_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_fill_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_put_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_mean_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_reduce_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_index_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_inner_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_int_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isclose_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isfinite_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isinf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isnan_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isneginf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isposinf_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_isreal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_istft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_istft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_item_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_2inputs_2outputs_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_4inputs_with_extra_args_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_binary_return_by_ref_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_jiterator_unary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kron_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_kthvalue_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lcm_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ldexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_le_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lerp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lgamma_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cholesky_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cond_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_cross_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_det_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_diagonal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eig_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvals_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_eigvalsh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_householder_product_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_inv_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_factor_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_ldl_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lstsq_grad_oriented_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_factor_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_lu_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_power_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_matrix_rank_hermitian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_multi_dot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_norm_subgradients_at_zero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_hermitian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_pinv_singular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_qr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_slogdet_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_ex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_solve_triangular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_svdvals_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorinv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_tensorsolve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vander_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vecdot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linalg_vector_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_linspace_tensor_overload_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log10_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log1p_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log2_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_normal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_log_softmax_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logaddexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logcumsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logdet_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_and_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_not_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_or_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logical_xor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logspace_tensor_overload_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_logsumexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_long_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_lu_unpack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mH_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mT_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_argmin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumprod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_cumsum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_fill_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_log_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logaddexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_logsumexp_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_median_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_normalize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_softmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_std_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_masked_var_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matmul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_matrix_exp_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_pool2d_with_indices_backward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_no_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_max_reduction_with_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_maximum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_median_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_list_of_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_meshgrid_variadic_tensors_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_binary_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_no_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_min_reduction_with_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_minimum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mode_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_movedim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_msort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mul_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_multinomial_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mv_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_mvlgamma_mvlgamma_p_5_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nan_to_num_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanmedian_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanquantile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nanquantile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nansum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_narrow_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_batch_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_dropout_backward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_native_layer_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ne_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_neg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_empty_strided_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_full_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_ones_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_new_zeros_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nextafter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_adaptive_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_alpha_dropout_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_avg_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_batch_norm_without_cudnn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_binary_cross_entropy_with_logits_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_celu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_channel_shuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_conv_transpose3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_embedding_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cosine_similarity_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_cross_entropy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_ctc_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_ctc_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout3d_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_dropout_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_elu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_bag_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_embedding_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_with_train_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_feature_alpha_dropout_without_train_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_fractional_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gaussian_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_gelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_glu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_grid_sample_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_group_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardsigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardswish_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hardtanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_hinge_embedding_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_huber_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_instance_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_area_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bicubic_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_linear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest-exact_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_nearest_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_interpolate_trilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_kl_div_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_l1_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_layer_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_leaky_relu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_linear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_local_response_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_logsigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_margin_ranking_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_pool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool1d_grad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool2d_grad_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_max_unpool3d_grad_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mish_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_mse_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_head_attention_forward_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multi_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_multilabel_soft_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_normalize_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_one_hot_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_circular_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_constant_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_reflect_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pad_replicate_negative_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pairwise_distance_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pdist_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pdist_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_shuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_pixel_unshuffle_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_poisson_nll_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_prelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu6_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_relu_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rms_norm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_rrelu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_scaled_dot_product_attention_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_selu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_complex_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_complex_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_silu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_smooth_l1_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_soft_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softmin_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softplus_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_softsign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_tanhshrink_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_threshold_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_triplet_margin_with_distance_loss_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_unfold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_bilinear_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nn_functional_upsample_nearest_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_nonzero_static_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_fro_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_inf_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_norm_nuc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_in_place_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_normal_number_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ones_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ormqr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_outer_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pca_lowrank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_permute_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pinverse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polar_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polar_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_2_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_3_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_polygamma_polygamma_n_4_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_positive_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_pow_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_put_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_qr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_quantile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_quantile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rad2deg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rand_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randint_like_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_randn_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_ravel_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_real_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reciprocal_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_remainder_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_renorm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_repeat_interleave_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_reshape_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resize_as__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_conj_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_resolve_neg_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_roll_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rot90_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_round_decimals_neg_3_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsqrt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_rsub_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scalar_tensor_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_add_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amax_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_amin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_mean_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_prod_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_scatter_reduce_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_searchsorted_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_select_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sgn_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_short_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sigmoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sign_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_bartlett_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_bartlett_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_blackman_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_blackman_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_cosine_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_cosine_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_exponential_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_exponential_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_gaussian_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_gaussian_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_cosine_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_cosine_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_hamming_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_general_hamming_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hamming_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hamming_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hann_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_hann_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_kaiser_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_kaiser_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_nuttall_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signal_windows_nuttall_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_signbit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sin_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sinh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_slice_scatter_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_softmax_with_dtype_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sort_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_mm_reduce_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sparse_sampled_addmm_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_airy_ai_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_j1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_bessel_y1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_u_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_v_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_chebyshev_polynomial_w_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_entr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_erfcx_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_h_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_hermite_polynomial_he_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i0e_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_i1e_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_laguerre_polynomial_l_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_legendre_polynomial_p_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_log_ndtr_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_i1_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_modified_bessel_k1_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtr_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_ndtri_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_polygamma_special_polygamma_n_0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_scaled_modified_bessel_k1_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_u_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_v_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_spherical_bessel_j0_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_xlog1py_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_special_zeta_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_list_args_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_split_with_sizes_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sqrt_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_square_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_squeeze_multiple_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_mean_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_std_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_stft_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sub_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_sum_to_size_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_svd_lowrank_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_bool, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_t_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_along_dim_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_take_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tan_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tanh_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensor_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tensordot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tile_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_to_sparse_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_topk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch__scaled_mm_cuda_float8_e4m3fn, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch__scaled_mm_v2_cuda_float8_e4m3fn, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__flash_attention_forward_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__flash_attention_forward_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_torch_ops_aten__safe_softmax_default_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trace_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_transpose_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_complex64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapezoid_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trapz_cuda_uint8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triangular_solve_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_indices_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_tril_indices_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_indices_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_triu_indices_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_true_divide_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_trunc_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_bfloat16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unbind_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unflatten_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unfold_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_uniform_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_consecutive_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unique_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unravel_index_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_chunk_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsafe_split_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_float64, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_unsqueeze_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_mean_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_var_unbiased_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vdot_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_complex_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_real_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_as_real_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_copy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int32, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_view_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vsplit_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex128, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_vstack_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_where_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_xlogy_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float16, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zero__cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_int8, 
test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_cuda_uint8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_bfloat16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_bool, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex128, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_complex64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_float64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int16, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int32, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int64, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_int8, test/test_schema_check.py::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_zeros_like_cuda_uint8 2025-12-04T12:44:16.8874302Z 2025-12-04T12:44:16.8874415Z Finished test_schema_check 1/1 ... [2025-12-04 12:44:16.714480][3578765.239291125], took 9.62min 2025-12-04T12:44:16.8874793Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T12:44:16.8875140Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:44:16.8875336Z Running test_ops 3/5 ... 
[2025-12-04 12:44:16.721299][3578765.246113437] 2025-12-04T12:44:16.8875489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:44:16.8875852Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops.py', '--shard-id=3', '--num-shards=5', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:44:16.721487] 2025-12-04T13:28:26.2796086Z 2025-12-04T13:28:26.2796906Z PRINTING LOG FILE of test_ops 3/5 (test/test-reports/test_ops_3.5_ea3c4bc91b7c0df0_.log) 2025-12-04T13:28:26.2797725Z Test results will be stored in test-reports/python-pytest/test_ops/test_ops-104b49b023bdc7f9.xml 2025-12-04T13:28:26.2798331Z ============================= test session starts ============================== 2025-12-04T13:28:26.2798938Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:28:26.2799466Z cachedir: .pytest_cache 2025-12-04T13:28:26.2800105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:28:26.2800781Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:28:26.2801112Z configfile: pytest.ini 2025-12-04T13:28:26.2801779Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, anyio-4.12.0, typeguard-4.3.0 2025-12-04T13:28:26.2802696Z collecting ... 
collected 33666 items 2025-12-04T13:28:26.2803092Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T13:28:26.3540612Z Running 6702 items in this shard: test/test_ops.py::TestSelfKwarg::test_self_kwargs, test/test_ops.py::TestCommonCUDA::test_compare_cpu_T_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___rmod___cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_half_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_like_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flip_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_left_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_full_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_fill_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_compare_cpu_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_new_full_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_with_distance_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_outer_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_compare_cpu_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_select_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_take_along_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zero__cuda_float32, test/test_ops.py::TestCommonCUDA::test_compare_cpu_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atanh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_1d_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_bool_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mT_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_masked_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv3d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose2d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_select_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_dtypes_T_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes___getitem___cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes___rmul___cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__native_batch_norm_legit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_half_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan2_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_left_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_shapes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_ceil_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_erfinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_flatten_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_fmod_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_gt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_index_select_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_isinf_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_matrix_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vector_norm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_logsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_masked_fill_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_celu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_dropout_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pdist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_sgn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_erfcx_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes__refs_special_logit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_t_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_to_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_transpose_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_indices_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__refs_vdot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_acos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_allclose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_arange_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_argwhere_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_as_strided_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atan_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_atleast_1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bincount_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bitwise_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bool_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_bucketize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_char_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_clone_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cosh_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_cross_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dist_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_dot_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_exponential_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_eye_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fft_irfftn_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_inner_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_isin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_item_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_kron_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvalsh_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_subgradients_at_zero_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_slogdet_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorinv_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logcumsumexp_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_or_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_amin_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_masked_normalize_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_matmul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_max_pool2d_with_indices_backward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mean_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_list_of_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_minimum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_mul_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_multinomial_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_native_dropout_backward_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_trilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool2d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_soft_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_norm_fro_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_normal_in_place_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_ones_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_outer_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_permute_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randint_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_randn_like_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_reciprocal_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_round_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amin_cuda, 
test/test_ops.py::TestCommonCUDA::test_dtypes_searchsorted_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sign_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_bartlett_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_softmax_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k1_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_squeeze_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_tile_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_transpose_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_triangular_solve_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unravel_index_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_vsplit_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_dtypes_zeros_like_cuda, test/test_ops.py::TestCommonCUDA::test_errors_T_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_errors_amin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_arange_cuda, test/test_ops.py::TestCommonCUDA::test_errors_bernoulli_cuda, test/test_ops.py::TestCommonCUDA::test_errors_clamp_max_cuda, test/test_ops.py::TestCommonCUDA::test_errors_complex_cuda, test/test_ops.py::TestCommonCUDA::test_errors_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diag_cuda, test/test_ops.py::TestCommonCUDA::test_errors_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_errors_dsplit_cuda, test/test_ops.py::TestCommonCUDA::test_errors_eq_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft2_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_errors_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_errors_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_errors_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gather_cuda, test/test_ops.py::TestCommonCUDA::test_errors_gradient_cuda, test/test_ops.py::TestCommonCUDA::test_errors_heaviside_cuda, test/test_ops.py::TestCommonCUDA::test_errors_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_errors_igamma_cuda, test/test_ops.py::TestCommonCUDA::test_errors_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_errors_item_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_grad_oriented_cuda, test/test_ops.py::TestCommonCUDA::test_errors_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_errors_logspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_errors_max_binary_cuda, test/test_ops.py::TestCommonCUDA::test_errors_maximum_cuda, 
test/test_ops.py::TestCommonCUDA::test_errors_mul_cuda, test/test_ops.py::TestCommonCUDA::test_errors_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_errors_native_layer_norm_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool1d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool3d_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_l1_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_margin_ranking_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda, test/test_ops.py::TestCommonCUDA::test_errors_nn_functional_prelu_cuda, test/test_ops.py::TestCommonCUDA::test_errors_pow_cuda, test/test_ops.py::TestCommonCUDA::test_errors_remainder_cuda, test/test_ops.py::TestCommonCUDA::test_errors_roll_cuda, test/test_ops.py::TestCommonCUDA::test_errors_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_errors_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_general_hamming_cuda, test/test_ops.py::TestCommonCUDA::test_errors_signal_windows_nuttall_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda, test/test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout4_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_t_cuda, test/test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_errors_trace_cuda, test/test_ops.py::TestCommonCUDA::test_errors_tril_cuda, test/test_ops.py::TestCommonCUDA::test_errors_true_divide_cuda, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__batch_norm_with_update_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addcmul_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ceil_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hypot_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ldexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cond_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_det_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_householder_product_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log1p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_not_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_masked_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_matmul_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_msort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nanquantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sort_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_legendre_polynomial_p_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_topk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___radd___cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_angle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_argmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_baddbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bernoulli_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bincount_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_block_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_corrcoef_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_empty_permuted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_log_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ne_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_new_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_grid_sample_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_reflect_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_negative_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_cosine_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sort_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_log_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtri_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_sum_to_size_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_triu_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argsort_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cfloat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_count_nonzero_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diff_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_double_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_2inputs_2outputs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lgamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_long_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_full_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_ones_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ravel_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize__cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_take_along_dim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_trace_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool, test/test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zeros_cuda_bool, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_frexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hash_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_igamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cond_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvals_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_multinomial_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_grid_sample_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_smooth_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_fro_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_in_place_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_number_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_lowrank_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_real_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_cosine_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_hamming_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_kaiser_cuda_float64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_numpy_ref_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out___rmul___cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_bool_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_cfloat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs__conversions_char_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_any_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_left_shift_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_out__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out__refs_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_cauchy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_conj_physical_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_fftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_hstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_lerp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_log_normal_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_ndtri_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_std_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_unfold_copy_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__refs_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_addcdiv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_asin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_bitwise_and_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_broadcast_shapes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_byte_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cholesky_inverse_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_constant_pad_nd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diag_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_diagflat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expand_as_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_expand_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_hfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_frac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_index_put_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_index_reduce_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_isreal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_logcumsumexp_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_lu_unpack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_argmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_logaddexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_masked_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_median_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_new_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nextafter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_conv3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_margin_ranking_loss_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_circular_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_norm_nuc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ones_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_randn_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_real_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft2_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_binary_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_permute_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_quantile_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sparse_sampled_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_squeeze_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_var_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_requires_grad_error_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_out_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_resolve_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_scatter_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_out_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_slice_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sort_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_entr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_erfcx_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_take_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_torch__scaled_mm_v2_cuda_float8_e4m3fn, test/test_ops.py::TestCommonCUDA::test_out_triangular_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_trunc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_uniform_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_view_as_real_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_out_warning___rdiv___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rpow___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bool_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_amax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_atan_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_contiguous_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_div_trunc_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_eq_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_expm1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifftn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_divide_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_fmod_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_ge_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_isreal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_tensor_overload_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_log_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_group_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_huber_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pairwise_distance_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_shuffle_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_poisson_nll_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_prod_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning__refs_randn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_renorm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_repeat_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sigmoid_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtri_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_special_spherical_bessel_j0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_sub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_trace_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_indices_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_put_accumulate_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_addmv_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_alias_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_as_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_asin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_atleast_3d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_baddbmm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bernoulli_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_bitwise_or_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_broadcast_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cauchy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cdouble_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cholesky_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_column_stack_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cummax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_cumsum_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_deg2rad_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_diag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_digamma_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_div_floor_rounding_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_empty_strided_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_erfc_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_expand_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_fft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fft_rfft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_fill_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_flip_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_float_power_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_gather_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_half_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_hash_tensor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_hsplit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_igammac_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_imag_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_amin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_istft_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_item_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_kthvalue_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_le_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lerp_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_cross_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_ex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_and_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_logical_xor_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_lu_solve_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_cumprod_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_log_softmax_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_median_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_masked_sum_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_variadic_tensors_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_mode_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_new_ones_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_without_cudnn_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_bilinear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv1d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose2d_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cosine_similarity_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cross_entropy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_ctc_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_elu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_gelu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_grid_sample_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardshrink_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardswish_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_linear_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_local_response_norm_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multi_head_attention_forward_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_reflect_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_with_dtype_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_threshold_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_nonzero_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_normal_number_mean_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_ones_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_permute_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_pinverse_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_quantile_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_randint_like_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_reshape_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_resize__cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rot90_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_round_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsqrt_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_rsub_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_short_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sign_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_cosine_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_gaussian_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_kaiser_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_signbit_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_sinh_cuda, 
test/test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_w_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i0_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i1_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_list_args_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_squeeze_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_svd_lowrank_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_tensor_split_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_triu_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_complex_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda, test/test_ops.py::TestCommonCUDA::test_out_warning_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_out_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_out_zeros_like_cuda_float32, test/test_ops.py::TestCommonCUDA::test_pointwise_tag_coverage_cuda, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int32, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_right_shift_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_copysign_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diag_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dstack_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifftn_cuda, 
test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft2_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_flipud_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmax_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hypot_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lcm_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_le_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_diagonal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_movedim_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mul_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_neg_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_softshrink_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sum_to_size_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_unbind_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_copy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_xlogy_cuda, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_shapes_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fnuz, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_istft_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bool, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e4m3fnuz, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float64, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_uint8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int8, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int32, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_indices_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_bfloat16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int16, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int64, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex128, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex128, 
test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bool, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int16, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int32, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_uint8, test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int16, test/test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_aminmax_cuda, test/test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_any_cuda, test/test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_argmax_cuda, test/test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_argmin_cuda, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rdiv___cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_put_accumulate_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_aminmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argsort_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_block_diag_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bmm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bucketize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_max_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_column_stack_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_combinations_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_copysign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cummax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_deg2rad_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_floor_rounding_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfinv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft2_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfftn_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_2d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gt_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hypot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_i0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_istft_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_multi_dot_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svdvals_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_not_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmedian_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_batch_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_zeros_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gelu_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_bilinear_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_nearest_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_nuc_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_4_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_prod_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ravel_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reciprocal_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsqrt_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_hamming_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sin_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i0e_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_xlog1py_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_copy_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_float32, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trace_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_float32, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_copy_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_unbiased_cuda_complex64, test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_complex64, 
test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___getitem___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addcdiv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_angle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_inverse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_column_stack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_combinations_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_conj_physical_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_fmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmax_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_matmul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_soft_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_normalize_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softsign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_log_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_backward_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unflatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmod___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input__chunk_cat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addmv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_alias_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_contiguous_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gradient_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igammac_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cond_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_det_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vander_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vector_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmedian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pdist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_silu_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rsub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scalar_tensor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signbit_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i0e_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_scaled_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_with_sizes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triangular_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tril_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_copy_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unfold_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addbmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_decomposed_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_1d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bernoulli_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bfloat16_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cfloat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_inverse_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clone_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_constant_pad_nd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_double_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_einsum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_float_power_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_half_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_put_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_mean_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isnan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_diagonal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svdvals_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_or_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logsumexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_scatter_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_multinomial_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nextafter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_alpha_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest-exact_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_l1_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_linear_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_upsample_nearest_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_static_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_fro_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_outer_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_permute_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_0_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_positive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_quantile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_real_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reciprocal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_sum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_searchsorted_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_softmax_with_dtype_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_hermite_polynomial_he_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_list_args_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sqrt_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_unbiased_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_along_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_to_sparse_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__safe_softmax_default_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unique_consecutive_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vdot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_T_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_allclose_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_block_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cauchy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_char_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_corrcoef_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cov_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_cummin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_equal_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_erf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fliplr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_geometric_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_gt_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_hstack_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_i0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_isneginf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_item_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ldexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_householder_product_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vander_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_log1p_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logdet_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumprod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_fill_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_matrix_exp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_min_binary_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_native_dropout_backward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ne_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_new_zeros_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_nll_loss_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu6_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_pow_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_roll_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_add_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_select_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_hamming_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_nuttall_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_scatter_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_special_zeta_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_to_size_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_tanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_uniform_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmatmul___cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_asinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_shapes_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_byte_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cat_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chunk_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_min_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cosh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_scatter_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dist_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfc_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfinv_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp2_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exponential_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eye_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flatten_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_power_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_like_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ge_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_3d_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_heaviside_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_int_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isfinite_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_le_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lerp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eig_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_solve_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logaddexp_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_and_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amin_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_median_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_minimum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mode_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_movedim_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_msort_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmean_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nansum_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_batch_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_strided_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_without_train_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardtanh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_margin_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_nll_loss_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_prelu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_silu_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_tanhshrink_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_inf_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pca_lowrank_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_qr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize_as__cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_neg_3_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_prod_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sgn_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_cosine_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hann_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinh_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_softmax_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y0_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_u_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_entr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_erfcx_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_ndtr_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k1_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_v_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_xlog1py_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32, 
test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_svd_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tan_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensordot_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tile_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trace_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_true_divide_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_split_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsqueeze_copy_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32, test/test_ops.py::TestCompositeComplianceCUDA::test_view_replay_xlogy_cuda_float32, test/test_ops.py::TestMathBitsCUDA::test_conj_view_H_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___radd___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view___rpow___cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcdiv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asin_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_2d_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumprod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_strided_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_float_power_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isfinite_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logaddexp_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_masked_fill_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_l1_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ones_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_renorm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_repeat_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_trace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view__refs_where_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_abs_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_decomposed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_2d_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_baddbmm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cartesian_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_chunk_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_cos_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_diag_embed_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_empty_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifftn_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfft2_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_fliplr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_float_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_full_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_half_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_add_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_index_put_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isinf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_isreal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvals_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_grad_oriented_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_factor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_hermitian_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorinv_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_and_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_not_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_logical_or_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_variadic_tensors_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_mm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_channel_shuffle_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_constant_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pairwise_distance_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_rms_norm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_triplet_margin_loss_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_norm_inf_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ones_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_ormqr_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_permute_copy_cuda_complex64, 
test/test_ops.py::TestMathBitsCUDA::test_conj_view_prod_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rand_like_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_renorm_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rot90_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_rsqrt_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_scalar_tensor_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_slice_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_multiple_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_std_unbiased_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sub_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_sum_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_svd_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_t_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_tile_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_true_divide_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unflatten_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_uniform_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_var_mean_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_real_cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_conj_view_zero__cuda_complex64, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___getitem___cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rsub___cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcdiv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_partial_views_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_1d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_block_diag_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_constant_pad_nd_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cosh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_count_nonzero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fliplr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_hstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_fill_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_item_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_matrix_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vector_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logspace_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_movedim_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_ones_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_shuffle_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_renorm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_roll_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsqrt_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sin_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_special_softmax_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tan_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_transpose_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_triu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unflatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vsplit_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acos_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_argwhere_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atleast_3d_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_contiguous_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_corrcoef_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_div_no_rounding_mode_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftn_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifftn_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flatten_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_like_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hstack_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_add_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_inner_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isreal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_istft_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_2inputs_2outputs_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ldexp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cross_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_diagonal_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eig_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvalsh_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_qr_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vander_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logdet_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_prod_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_scatter_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_std_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matrix_exp_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_empty_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_feature_alpha_dropout_without_train_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_normalize_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softmin_with_dtype_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_triplet_margin_with_distance_loss_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_static_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize_as__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_roll_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rot90_cuda_complex128, 
test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sigmoid_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_unbiased_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sub_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_copy_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_uniform_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_split_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_as_real_cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rdiv___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rmul___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view___rpow___cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__chunk_cat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bool_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_chalf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_double_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_float_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_half_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_block_diag_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cauchy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumsum_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_scatter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftshift_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft2_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_geometric_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_gt_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isposinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vector_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_movedim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_native_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ne_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_full_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_dropout_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_prelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_relu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmin_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softplus_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_tanhshrink_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_prod_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_real_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_softmax_with_dtype_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_xlog1py_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sub_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_transpose_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_abs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addcdiv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_addmv_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argmin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_asinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_3d_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bernoulli_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_bfloat16_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_byte_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_clone_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_corrcoef_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cos_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cummin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_cumulative_trapezoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_deg2rad_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagflat_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_eq_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_erfc_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_expand_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfftn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_floor_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_frexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_geometric_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_grid_sampler_3d_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amin_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isclose_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isinf_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isnan_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_isreal_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_2inputs_2outputs_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_le_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cross_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eig_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_solve_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_hermitian_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_multi_dot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vecdot_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_and_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logical_xor_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_lt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mH_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_fill_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_log_softmax_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_masked_var_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_matmul_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_no_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_with_dim_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_empty_strided_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_new_ones_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout2d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool3d_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardsigmoid_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardtanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_huber_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_area_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_local_response_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mish_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_margin_loss_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_shuffle_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_prelu_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_smooth_l1_loss_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softshrink_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softsign_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_threshold_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_norm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_outer_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_permute_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_3_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_qr_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_rand_like_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_randn_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_renorm_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hamming_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hann_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_kaiser_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_signbit_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sort_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_airy_ai_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y1_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_t_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_u_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_w_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_special_spherical_bessel_j0_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_list_args_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_sqrt_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_std_unbiased_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_t_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_tanh_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_topk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_torch_ops_aten__safe_softmax_default_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_true_divide_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_copy_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64, 
test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_chunk_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_unsqueeze_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_vstack_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64, test/test_ops.py::TestMathBitsCUDA::test_neg_view_xlogy_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_fake___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_acosh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_addcmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_alias_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_asin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmatmul___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rsub___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_put_accumulate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_abs_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_decomposed_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_all_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_any_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_to_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_trunc_rounding_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exp2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftshift_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geqrf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_grid_sampler_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_imag_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isfinite_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isnan_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isreal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_svdvals_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_normal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_or_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_variadic_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool3d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_glu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_linear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pixel_unshuffle_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_selu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pca_lowrank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resize_as__cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_add_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_zeta_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_sparse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_topk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_indices_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_real_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_and_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_block_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_byte_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_cartesian_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_clone_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__segment_reduce_lengths_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_partial_views_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cdouble_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chalf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_combinations_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_floor_rounding_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ldexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cond_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_singular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_svd_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vander_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vector_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log1p_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logdet_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mode_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_bag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_group_norm_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_layer_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pairwise_distance_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pdist_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_poisson_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_prelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rad2deg_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reciprocal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_with_dtype_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtri_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__efficient_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trace_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmod___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__batch_norm_with_update_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_scatter_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_baddbmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_inverse_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_min_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumulative_trapezoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_double_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flatten_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flip_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_select_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_inner_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lstsq_grad_oriented_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_triangular_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vecdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matmul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matrix_exp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_maximum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_list_of_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_msort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nansum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool2d_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_celu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_gaussian_nll_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_trilinear_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_reflect_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_threshold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_loss_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_real_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_remainder_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tensordot_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tril_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vdot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_complex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diag_embed_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_einsum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_empty_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_expand_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftn_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_fft_rfftn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_full_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gcd_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_fake_ge_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_geometric_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_put_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isclose_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isinf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_isneginf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_istft_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_lgamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_hermitian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_logical_xor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_argmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_logaddexp_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_minimum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_multinomial_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nanmedian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_full_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_new_ones_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nextafter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cross_entropy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_elu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_group_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardsigmoid_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_huber_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_area_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_grad_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_soft_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_norm_fro_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_outer_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rand_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_reshape_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_short_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_bartlett_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hamming_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_sort_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y0_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_fake_special_hermite_polynomial_h_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_i1e_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_laguerre_polynomial_l_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_w_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_split_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_square_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_squeeze_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_trapz_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_uniform_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_view_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_vstack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_where_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_fake_zeros_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_T_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___getitem___cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rand___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rxor___cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__softmax_backward_data_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__upsample_bilinear2d_aa_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcdiv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argwhere_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bincount_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_left_shift_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bool_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_tensors_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cfloat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chunk_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_copysign_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagflat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diff_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_digamma_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_as_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expm1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftshift_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gcd_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gradient_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_half_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_histc_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_i0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igammac_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_return_by_ref_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kthvalue_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_le_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_diagonal_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eig_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigh_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_householder_product_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_ex_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_power_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_hermitian_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorinv_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logcumsumexp_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_not_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_solve_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_unpack_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_fill_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_prod_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_std_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_sum_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_binary_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_pool2d_with_indices_backward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_reduction_with_dim_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_1_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_channel_shuffle_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv2d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_similarity_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool3d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hinge_embedding_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_local_response_norm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_mish_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_margin_loss_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_normalize_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_replicate_negative_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_silu_complex_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softplus_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_inf_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ormqr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polar_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_2_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randint_like_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ravel_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_renorm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_neg_3_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsub_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_amin_cuda_float32, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sgn_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_blackman_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_exponential_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sin_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sum_to_size_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_v2_cuda_float8_e4m3fn, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unflatten_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_copy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_real_cuda_complex64, 
test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vsplit_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_xlogy_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_bfloat16, 
test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float64, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_uint8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex128, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int8, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bfloat16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int16, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int32, test/test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_uint8, test/test_ops.py::TestTagsCUDA::test_tags_H_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___radd___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rmatmul___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags___rpow___cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__chunk_cat_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_T_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_byte_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cdouble_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs__conversions_float_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_abs_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_atleast_1d_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_not_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_bucketize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_count_nonzero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_equal_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flatten_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_flipud_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_gcd_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_geometric_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_hypot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isclose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_isposinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_lcm_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags__refs_lgamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_cross_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_linalg_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_log10_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logical_and_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logspace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_variadic_tensors_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_minimum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_movedim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_strided_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_triplet_margin_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_normal__in_place_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_ones_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sinh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_5_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_split_with_sizes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sqrt_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_square_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_stft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_take_along_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_to_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_trace_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_transpose_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unbind_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_unfold_copy_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__refs_vstack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags__segment_reduce_offsets_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_alias_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_amax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_argmin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_as_strided_partial_views_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_atanh_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_and_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bitwise_xor_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_shapes_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_broadcast_tensors_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cauchy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_max_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clamp_min_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_clone_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_constant_pad_nd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_contiguous_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cov_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_diagonal_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_digamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_einsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_empty_permuted_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_expand_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_eye_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_fftshift_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_hfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ifftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfft_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fft_irfftn_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fill_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_float_power_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_fmod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_frac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_grid_sampler_3d_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_heaviside_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_i0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_igammac_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_imag_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_index_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_index_reduce_amin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_int_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_isinf_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_istft_cuda_complex64, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_4inputs_with_extra_args_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_lcm_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_ex_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_svdvals_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vander_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vecdot_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log10_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log1p_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logdet_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_logsumexp_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_argmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumprod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_cumsum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_log_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_normalize_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_masked_prod_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_maximum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_min_reduction_no_dim_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_mm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nansum_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_narrow_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_native_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_batch_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_binary_cross_entropy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_hinge_embedding_loss_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bicubic_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_linear_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_leaky_relu_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_logsigmoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_margin_ranking_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_mse_loss_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_head_attention_forward_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_replicate_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu6_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_nn_functional_scaled_dot_product_attention_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_norm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_0_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_positive_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_qr_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rad2deg_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randint_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_randn_like_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_ravel_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_resolve_conj_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rot90_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_round_decimals_3_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_rsub_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_mean_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sign_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_gaussian_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sin_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_slice_scatter_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_mm_reduce_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_sparse_sampled_addmm_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i0e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_i1e_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_k1_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_ndtri_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_special_zeta_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_split_list_args_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_copy_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_squeeze_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_stack_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_std_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_tensor_split_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_topk_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_trapezoid_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unfold_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_uniform_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_unravel_index_cuda_int64, test/test_ops.py::TestTagsCUDA::test_tags_var_mean_cuda_float32, test/test_ops.py::TestTagsCUDA::test_tags_where_cuda_float32, 
test/test_ops.py::TestTagsCUDA::test_tags_zeros_cuda_float32, test/test_ops.py::TestForwardADWithScalarsCUDA::test_0d_tensor_with_python_scalar_div_no_rounding_mode_cuda_float32, test/test_ops.py::TestForwardADWithScalarsCUDA::test_0d_tensor_with_python_scalar_div_trunc_rounding_cuda_float32 2025-12-04T13:28:26.4256183Z 2025-12-04T13:28:26.4256298Z test_ops.py::TestSelfKwarg::test_self_kwargs PASSED [0.0012s] [ 0%] 2025-12-04T13:28:26.4256604Z test_ops.py::TestCommonCUDA::test_compare_cpu_T_cuda_float32 SKIPPED [0.0949s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4257041Z test_ops.py::TestCommonCUDA::test_compare_cpu___getitem___cuda_float32 SKIPPED [0.0012s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4257440Z test_ops.py::TestCommonCUDA::test_compare_cpu___rmod___cuda_float32 SKIPPED [0.0015s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4257829Z test_ops.py::TestCommonCUDA::test_compare_cpu___ror___cuda_int64 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4258231Z test_ops.py::TestCommonCUDA::test_compare_cpu__batch_norm_with_update_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4258689Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bfloat16_cuda_float32 SKIPPED [0.8315s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4259181Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_bool_cuda_float32 SKIPPED [0.0013s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4259622Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_cfloat_cuda_float32 SKIPPED [0.0011s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4260061Z 
test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_complex_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4263456Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_double_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4263893Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_half_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4264318Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs__conversions_long_cuda_float32 SKIPPED [0.0001s] (Overflow when downcasting signed type is undefined) [ 0%] 2025-12-04T13:28:26.4264738Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_as_strided_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4265149Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_atleast_2d_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4265589Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_bitwise_right_shift_cuda_int64 SKIPPED [0.0001s] (Skipped some inputs produce undefined outputs) [ 0%] 2025-12-04T13:28:26.4265999Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_block_diag_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4266412Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_constant_pad_nd_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4266825Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_copysign_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4267243Z 
test_ops.py::TestCommonCUDA::test_compare_cpu__refs_div_no_rounding_mode_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4267656Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_dstack_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4268028Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_empty_like_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 0%] 2025-12-04T13:28:26.4268446Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fft_ifftshift_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4268848Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_flip_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4269254Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_fmin_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4269652Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_index_add_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4270065Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linalg_svdvals_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4270478Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_linspace_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4270898Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logaddexp2_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4271307Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_logsumexp_cuda_float32 SKIPPED [0.0009s] (test is slow; run 
with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4271716Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_masked_fill_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4272159Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_movedim_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4272541Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_new_empty_strided_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 0%] 2025-12-04T13:28:26.4272936Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_glu_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4273369Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_hardtanh_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4273807Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_leaky_relu_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4274280Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_log_softmax_with_dtype_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4274738Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_pixel_shuffle_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4275192Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_poisson_nll_loss_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4275642Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_nn_functional_relu6_cuda_float32 SKIPPED [0.0009s] (test is slow; run 
with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4276054Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_norm_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4276455Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_permute_copy_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4276864Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_rot90_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4277314Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_special_xlog1py_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4277739Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_std_mean_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4278137Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_t_copy_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4278546Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_take_along_dim_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4278952Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_unfold_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4279361Z test_ops.py::TestCommonCUDA::test_compare_cpu__refs_view_as_complex_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4279814Z test_ops.py::TestCommonCUDA::test_compare_cpu__unsafe_masked_index_put_accumulate_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 
2025-12-04T13:28:26.4280241Z test_ops.py::TestCommonCUDA::test_compare_cpu_arange_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4280639Z test_ops.py::TestCommonCUDA::test_compare_cpu_as_strided_scatter_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4281049Z test_ops.py::TestCommonCUDA::test_compare_cpu_atleast_2d_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4281442Z test_ops.py::TestCommonCUDA::test_compare_cpu_baddbmm_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4281822Z test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_left_shift_cuda_int64 SKIPPED [0.0001s] (Some inputs produce undefined outputs) [ 0%] 2025-12-04T13:28:26.4282253Z test_ops.py::TestCommonCUDA::test_compare_cpu_bitwise_right_shift_cuda_int64 SKIPPED [0.0001s] (Some inputs produce undefined outputs) [ 0%] 2025-12-04T13:28:26.4282643Z test_ops.py::TestCommonCUDA::test_compare_cpu_cartesian_prod_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4283008Z test_ops.py::TestCommonCUDA::test_compare_cpu_cauchy_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 0%] 2025-12-04T13:28:26.4283374Z test_ops.py::TestCommonCUDA::test_compare_cpu_cdouble_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4283771Z test_ops.py::TestCommonCUDA::test_compare_cpu_combinations_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4284170Z test_ops.py::TestCommonCUDA::test_compare_cpu_cumprod_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 
2025-12-04T13:28:26.4284555Z test_ops.py::TestCommonCUDA::test_compare_cpu_dist_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4284936Z test_ops.py::TestCommonCUDA::test_compare_cpu_dot_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4285320Z test_ops.py::TestCommonCUDA::test_compare_cpu_double_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4285673Z test_ops.py::TestCommonCUDA::test_compare_cpu_empty_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 0%] 2025-12-04T13:28:26.4286035Z test_ops.py::TestCommonCUDA::test_compare_cpu_full_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4286419Z test_ops.py::TestCommonCUDA::test_compare_cpu_full_like_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 0%] 2025-12-04T13:28:26.4286821Z test_ops.py::TestCommonCUDA::test_compare_cpu_gather_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4287181Z test_ops.py::TestCommonCUDA::test_compare_cpu_geometric_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 1%] 2025-12-04T13:28:26.4287545Z test_ops.py::TestCommonCUDA::test_compare_cpu_index_fill_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4287948Z test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_amin_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4288360Z test_ops.py::TestCommonCUDA::test_compare_cpu_index_reduce_prod_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4288779Z 
test_ops.py::TestCommonCUDA::test_compare_cpu_isin_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4289164Z test_ops.py::TestCommonCUDA::test_compare_cpu_istft_cuda_complex64 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4289549Z test_ops.py::TestCommonCUDA::test_compare_cpu_lerp_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4289950Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_ldl_factor_ex_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4290359Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lstsq_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4290755Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4291154Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_lu_factor_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4291557Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4292021Z test_ops.py::TestCommonCUDA::test_compare_cpu_linalg_pinv_singular_cuda_float32 SKIPPED [0.0005s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4292434Z test_ops.py::TestCommonCUDA::test_compare_cpu_linspace_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4292802Z test_ops.py::TestCommonCUDA::test_compare_cpu_log_normal_cuda_float32 SKIPPED [0.0001s] (output is non-deterministic) [ 1%] 
2025-12-04T13:28:26.4293171Z test_ops.py::TestCommonCUDA::test_compare_cpu_logcumsumexp_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4293591Z test_ops.py::TestCommonCUDA::test_compare_cpu_logspace_tensor_overload_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4293994Z test_ops.py::TestCommonCUDA::test_compare_cpu_long_cuda_float32 SKIPPED [0.0001s] (Overflow when downcasting signed type is undefined) [ 1%] 2025-12-04T13:28:26.4294383Z test_ops.py::TestCommonCUDA::test_compare_cpu_masked_log_softmax_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4294810Z test_ops.py::TestCommonCUDA::test_compare_cpu_masked_logaddexp_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4295218Z test_ops.py::TestCommonCUDA::test_compare_cpu_masked_select_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4295629Z test_ops.py::TestCommonCUDA::test_compare_cpu_matmul_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4296012Z test_ops.py::TestCommonCUDA::test_compare_cpu_msort_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4296389Z test_ops.py::TestCommonCUDA::test_compare_cpu_mv_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4296772Z test_ops.py::TestCommonCUDA::test_compare_cpu_narrow_copy_cuda_float32 SKIPPED [0.0011s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4297147Z test_ops.py::TestCommonCUDA::test_compare_cpu_new_empty_strided_cuda_float32 SKIPPED [0.0001s] (output is 
non-deterministic) [ 1%] 2025-12-04T13:28:26.4297527Z test_ops.py::TestCommonCUDA::test_compare_cpu_new_full_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4297919Z test_ops.py::TestCommonCUDA::test_compare_cpu_nextafter_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4298346Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_adaptive_avg_pool2d_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4298803Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_binary_cross_entropy_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4299246Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv2d_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4299613Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_conv3d_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 1%] 2025-12-04T13:28:26.4299992Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_cosine_similarity_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4300438Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_embedding_bag_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4300896Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_bilinear_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4301358Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_linear_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4301812Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_nearest_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4302315Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_interpolate_trilinear_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4302777Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_multilabel_margin_loss_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4303229Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_poisson_nll_loss_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4303689Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_loss_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4304159Z 
test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_triplet_margin_with_distance_loss_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4304639Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_bilinear_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4305090Z test_ops.py::TestCommonCUDA::test_compare_cpu_nn_functional_upsample_nearest_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4305473Z test_ops.py::TestCommonCUDA::test_compare_cpu_nonzero_static_cuda_float32 SKIPPED [0.0005s] (Only runs on cpu) [ 1%] 2025-12-04T13:28:26.4305822Z test_ops.py::TestCommonCUDA::test_compare_cpu_norm_inf_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4306208Z test_ops.py::TestCommonCUDA::test_compare_cpu_ormqr_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4306605Z test_ops.py::TestCommonCUDA::test_compare_cpu_outer_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4306983Z test_ops.py::TestCommonCUDA::test_compare_cpu_put_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4307365Z test_ops.py::TestCommonCUDA::test_compare_cpu_reshape_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4307760Z test_ops.py::TestCommonCUDA::test_compare_cpu_scalar_tensor_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4308161Z test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_add_cuda_float32 SKIPPED [0.0009s] (test is slow; run with 
PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4308566Z test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amax_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4308982Z test_ops.py::TestCommonCUDA::test_compare_cpu_scatter_reduce_amin_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4309394Z test_ops.py::TestCommonCUDA::test_compare_cpu_select_scatter_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4309804Z test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4310203Z test_ops.py::TestCommonCUDA::test_compare_cpu_softmax_with_dtype_cuda_float32 SKIPPED [0.0008s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4310601Z test_ops.py::TestCommonCUDA::test_compare_cpu_sort_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4311020Z test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_u_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4311470Z test_ops.py::TestCommonCUDA::test_compare_cpu_special_chebyshev_polynomial_w_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4311971Z test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_t_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4312438Z test_ops.py::TestCommonCUDA::test_compare_cpu_special_shifted_chebyshev_polynomial_w_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW 
to enable test) [ 1%] 2025-12-04T13:28:26.4312876Z test_ops.py::TestCommonCUDA::test_compare_cpu_svd_cuda_float32 SKIPPED [0.0011s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4313267Z test_ops.py::TestCommonCUDA::test_compare_cpu_t_cuda_float32 SKIPPED [0.0010s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4313654Z test_ops.py::TestCommonCUDA::test_compare_cpu_take_along_dim_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4314052Z test_ops.py::TestCommonCUDA::test_compare_cpu_to_sparse_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 1%] 2025-12-04T13:28:26.4314436Z test_ops.py::TestCommonCUDA::test_compare_cpu_zero__cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 2%] 2025-12-04T13:28:26.4314818Z test_ops.py::TestCommonCUDA::test_compare_cpu_zeros_cuda_float32 SKIPPED [0.0009s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 2%] 2025-12-04T13:28:26.4315165Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_T_cuda_complex32 PASSED [0.8476s] [ 2%] 2025-12-04T13:28:26.4315454Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_asin_cuda_complex32 PASSED [0.7586s] [ 2%] 2025-12-04T13:28:26.4315744Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atanh_cuda_complex32 PASSED [0.8920s] [ 2%] 2025-12-04T13:28:26.4316042Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_1d_cuda_complex32 PASSED [0.7375s] [ 2%] 2025-12-04T13:28:26.4316347Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_atleast_3d_cuda_complex32 PASSED [0.7265s] [ 2%] 2025-12-04T13:28:26.4316643Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_bool_cuda_complex32 PASSED [0.7408s] [ 2%] 2025-12-04T13:28:26.4316941Z 
test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_column_stack_cuda_complex32 PASSED [0.7388s] [ 2%] 2025-12-04T13:28:26.4317239Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_eq_cuda_complex32 PASSED [0.7334s] [ 2%] 2025-12-04T13:28:26.4317530Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_fft_ifft_cuda_complex32 PASSED [4.2725s] [ 2%] 2025-12-04T13:28:26.4317831Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_index_add_cuda_complex32 PASSED [0.8356s] [ 2%] 2025-12-04T13:28:26.4318125Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_mT_cuda_complex32 PASSED [0.7206s] [ 2%] 2025-12-04T13:28:26.4318420Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_masked_fill_cuda_complex32 PASSED [0.7224s] [ 2%] 2025-12-04T13:28:26.4318809Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_new_empty_strided_cuda_complex32 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 2%] 2025-12-04T13:28:26.4319375Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv2d_cuda_complex32 MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7db6e4401800 size: 768 2025-12-04T13:28:26.4319920Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7db6e4401800 size: 768 2025-12-04T13:28:26.4320329Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 2400, provided ptr: 0x7db6e4400e00 size: 1024 2025-12-04T13:28:26.4320739Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7db6e4400e00 size: 1024 2025-12-04T13:28:26.4321001Z PASSED [2.2232s] [ 2%] 2025-12-04T13:28:26.4321400Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv3d_cuda_complex32 MIOpen(HIP): Warning 
[IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 26400, provided ptr: 0x7daec7c01600 size: 5888 2025-12-04T13:28:26.4322001Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 26400, provided ptr: 0x7daec7c01600 size: 5888 2025-12-04T13:28:26.4322443Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 52800, provided ptr: 0x7daec7c07400 size: 11008 2025-12-04T13:28:26.4322863Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 52800, provided ptr: 0x7daec7c07400 size: 11008 2025-12-04T13:28:26.4323282Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 168960, provided ptr: 0x7daec7c00000 size: 6656 2025-12-04T13:28:26.4323699Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 168960, provided ptr: 0x7daec7c00000 size: 6656 2025-12-04T13:28:26.4324120Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback WTI] Solver , workspace required: 337920, provided ptr: 0x7daec7c01600 size: 12544 2025-12-04T13:28:26.4324554Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 337920, provided ptr: 0x7daec7c01600 size: 12544 2025-12-04T13:28:26.4324817Z PASSED [0.8148s] [ 2%] 2025-12-04T13:28:26.4325044Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_nn_functional_conv_transpose2d_cuda_complex32 PASSED [3.1141s] [ 2%] 2025-12-04T13:28:26.4325374Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_positive_cuda_complex32 PASSED [0.7186s] [ 2%] 2025-12-04T13:28:26.4325684Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_pow_cuda_complex32 SKIPPED [0.0002s] (Skipped!) 
[ 2%] 2025-12-04T13:28:26.4325990Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_select_cuda_complex32 PASSED [0.7377s] [ 2%] 2025-12-04T13:28:26.4326304Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_split_with_sizes_copy_cuda_complex32 PASSED [0.7206s] [ 2%] 2025-12-04T13:28:26.4326617Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_squeeze_cuda_complex32 PASSED [0.7189s] [ 2%] 2025-12-04T13:28:26.4326911Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_stack_cuda_complex32 PASSED [0.0081s] [ 2%] 2025-12-04T13:28:26.4327202Z test_ops.py::TestCommonCUDA::test_complex_half_reference_testing_zeros_cuda_complex32 PASSED [0.7240s] [ 2%] 2025-12-04T13:28:26.4327456Z test_ops.py::TestCommonCUDA::test_dtypes_T_cuda PASSED [0.7449s] [ 2%] 2025-12-04T13:28:26.4327676Z test_ops.py::TestCommonCUDA::test_dtypes___getitem___cuda PASSED [0.8343s] [ 2%] 2025-12-04T13:28:26.4327914Z test_ops.py::TestCommonCUDA::test_dtypes___rmul___cuda PASSED [0.7653s] [ 2%] 2025-12-04T13:28:26.4328154Z test_ops.py::TestCommonCUDA::test_dtypes__native_batch_norm_legit_cuda PASSED [0.7636s] [ 2%] 2025-12-04T13:28:26.4353121Z test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_cdouble_cuda PASSED [0.7472s] [ 2%] 2025-12-04T13:28:26.4353489Z test_ops.py::TestCommonCUDA::test_dtypes__refs__conversions_half_cuda PASSED [0.7481s] [ 2%] 2025-12-04T13:28:26.4353762Z test_ops.py::TestCommonCUDA::test_dtypes__refs_add_cuda PASSED [0.7735s] [ 2%] 2025-12-04T13:28:26.4353990Z test_ops.py::TestCommonCUDA::test_dtypes__refs_atan2_cuda PASSED [0.7904s] [ 2%] 2025-12-04T13:28:26.4354218Z test_ops.py::TestCommonCUDA::test_dtypes__refs_atan_cuda PASSED [0.7464s] [ 2%] 2025-12-04T13:28:26.4354448Z test_ops.py::TestCommonCUDA::test_dtypes__refs_atleast_2d_cuda PASSED [0.7496s] [ 2%] 2025-12-04T13:28:26.4354704Z test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_left_shift_cuda PASSED [0.7692s] [ 2%] 2025-12-04T13:28:26.4354977Z 
test_ops.py::TestCommonCUDA::test_dtypes__refs_bitwise_right_shift_cuda PASSED [0.7635s] [ 2%] 2025-12-04T13:28:26.4355310Z test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_shapes_cuda PASSED [0.0264s] [ 2%] 2025-12-04T13:28:26.4355569Z test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_tensors_cuda PASSED [0.7573s] [ 2%] 2025-12-04T13:28:26.4355825Z test_ops.py::TestCommonCUDA::test_dtypes__refs_broadcast_to_cuda PASSED [0.7812s] [ 2%] 2025-12-04T13:28:26.4356087Z test_ops.py::TestCommonCUDA::test_dtypes__refs_bucketize_cuda PASSED [0.8888s] [ 2%] 2025-12-04T13:28:26.4356321Z test_ops.py::TestCommonCUDA::test_dtypes__refs_ceil_cuda PASSED [0.7389s] [ 2%] 2025-12-04T13:28:26.4356544Z test_ops.py::TestCommonCUDA::test_dtypes__refs_cos_cuda PASSED [0.7451s] [ 2%] 2025-12-04T13:28:26.4356770Z test_ops.py::TestCommonCUDA::test_dtypes__refs_cumsum_cuda PASSED [0.7417s] [ 2%] 2025-12-04T13:28:26.4357018Z test_ops.py::TestCommonCUDA::test_dtypes__refs_diagonal_scatter_cuda PASSED [0.7630s] [ 2%] 2025-12-04T13:28:26.4357263Z test_ops.py::TestCommonCUDA::test_dtypes__refs_erfinv_cuda PASSED [0.7331s] [ 2%] 2025-12-04T13:28:26.4357504Z test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_copy_cuda PASSED [0.7620s] [ 2%] 2025-12-04T13:28:26.4357743Z test_ops.py::TestCommonCUDA::test_dtypes__refs_expand_cuda PASSED [0.7455s] [ 2%] 2025-12-04T13:28:26.4357990Z test_ops.py::TestCommonCUDA::test_dtypes__refs_fft_ifftn_cuda PASSED [9.8413s] [ 2%] 2025-12-04T13:28:26.4358228Z test_ops.py::TestCommonCUDA::test_dtypes__refs_flatten_cuda PASSED [1.2142s] [ 2%] 2025-12-04T13:28:26.4358466Z test_ops.py::TestCommonCUDA::test_dtypes__refs_floor_divide_cuda PASSED [0.1530s] [ 2%] 2025-12-04T13:28:26.4358700Z test_ops.py::TestCommonCUDA::test_dtypes__refs_fmin_cuda PASSED [1.2434s] [ 2%] 2025-12-04T13:28:26.4358924Z test_ops.py::TestCommonCUDA::test_dtypes__refs_fmod_cuda PASSED [1.2628s] [ 2%] 2025-12-04T13:28:26.4359153Z 
test_ops.py::TestCommonCUDA::test_dtypes__refs_gt_cuda PASSED [1.2250s] [ 2%] 2025-12-04T13:28:26.4359376Z test_ops.py::TestCommonCUDA::test_dtypes__refs_hsplit_cuda PASSED [1.1921s] [ 2%] 2025-12-04T13:28:26.4359614Z test_ops.py::TestCommonCUDA::test_dtypes__refs_igammac_cuda PASSED [1.2595s] [ 2%] 2025-12-04T13:28:26.4359849Z test_ops.py::TestCommonCUDA::test_dtypes__refs_index_add_cuda PASSED [1.2400s] [ 2%] 2025-12-04T13:28:26.4360089Z test_ops.py::TestCommonCUDA::test_dtypes__refs_index_fill_cuda PASSED [1.2079s] [ 2%] 2025-12-04T13:28:26.4360329Z test_ops.py::TestCommonCUDA::test_dtypes__refs_index_select_cuda PASSED [1.2228s] [ 2%] 2025-12-04T13:28:26.4360567Z test_ops.py::TestCommonCUDA::test_dtypes__refs_isinf_cuda PASSED [1.2031s] [ 2%] 2025-12-04T13:28:26.4360792Z test_ops.py::TestCommonCUDA::test_dtypes__refs_istft_cuda PASSED [6.7549s] [ 2%] 2025-12-04T13:28:26.4361015Z test_ops.py::TestCommonCUDA::test_dtypes__refs_le_cuda PASSED [0.7792s] [ 2%] 2025-12-04T13:28:26.4361269Z test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_matrix_norm_cuda PASSED [1.2419s] [ 2%] 2025-12-04T13:28:26.4361524Z test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_norm_cuda PASSED [0.9484s] [ 2%] 2025-12-04T13:28:26.4361777Z test_ops.py::TestCommonCUDA::test_dtypes__refs_linalg_vector_norm_cuda PASSED [1.0473s] [ 2%] 2025-12-04T13:28:26.4362072Z test_ops.py::TestCommonCUDA::test_dtypes__refs_linspace_cuda PASSED [0.0998s] [ 2%] 2025-12-04T13:28:26.4362327Z test_ops.py::TestCommonCUDA::test_dtypes__refs_log_softmax_with_dtype_cuda PASSED [0.7654s] [ 2%] 2025-12-04T13:28:26.4362585Z test_ops.py::TestCommonCUDA::test_dtypes__refs_logical_and_cuda PASSED [0.7682s] [ 3%] 2025-12-04T13:28:26.4362824Z test_ops.py::TestCommonCUDA::test_dtypes__refs_logsumexp_cuda PASSED [0.7728s] [ 3%] 2025-12-04T13:28:26.4363062Z test_ops.py::TestCommonCUDA::test_dtypes__refs_masked_fill_cuda PASSED [0.7689s] [ 3%] 2025-12-04T13:28:26.4363295Z 
test_ops.py::TestCommonCUDA::test_dtypes__refs_mean_cuda PASSED [0.7890s] [ 3%] 2025-12-04T13:28:26.4363519Z test_ops.py::TestCommonCUDA::test_dtypes__refs_mul_cuda PASSED [0.7656s] [ 3%] 2025-12-04T13:28:26.4363760Z test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_celu_cuda PASSED [0.7409s] [ 3%] 2025-12-04T13:28:26.4364044Z test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_dropout_cuda PASSED [0.7876s] [ 3%] 2025-12-04T13:28:26.4364311Z test_ops.py::TestCommonCUDA::test_dtypes__refs_nn_functional_pdist_cuda PASSED [0.9207s] [ 3%] 2025-12-04T13:28:26.4385139Z test_ops.py::TestCommonCUDA::test_dtypes__refs_permute_cuda PASSED [0.7364s] [ 3%] 2025-12-04T13:28:26.4385392Z test_ops.py::TestCommonCUDA::test_dtypes__refs_remainder_cuda PASSED [0.7881s] [ 3%] 2025-12-04T13:28:26.4385616Z test_ops.py::TestCommonCUDA::test_dtypes__refs_renorm_cuda PASSED [0.7420s] [ 3%] 2025-12-04T13:28:26.4385833Z test_ops.py::TestCommonCUDA::test_dtypes__refs_repeat_cuda PASSED [0.0699s] [ 3%] 2025-12-04T13:28:26.4386058Z test_ops.py::TestCommonCUDA::test_dtypes__refs_select_scatter_cuda PASSED [0.7610s] [ 3%] 2025-12-04T13:28:26.4386285Z test_ops.py::TestCommonCUDA::test_dtypes__refs_sgn_cuda PASSED [0.7555s] [ 3%] 2025-12-04T13:28:26.4386507Z test_ops.py::TestCommonCUDA::test_dtypes__refs_special_erfcx_cuda PASSED [0.7763s] [ 3%] 2025-12-04T13:28:26.4386738Z test_ops.py::TestCommonCUDA::test_dtypes__refs_special_i1_cuda PASSED [0.7383s] [ 3%] 2025-12-04T13:28:26.4386986Z test_ops.py::TestCommonCUDA::test_dtypes__refs_special_log_ndtr_cuda PASSED [0.7430s] [ 3%] 2025-12-04T13:28:26.4387227Z test_ops.py::TestCommonCUDA::test_dtypes__refs_special_logit_cuda PASSED [0.7348s] [ 3%] 2025-12-04T13:28:26.4387461Z test_ops.py::TestCommonCUDA::test_dtypes__refs_squeeze_copy_cuda PASSED [0.7635s] [ 3%] 2025-12-04T13:28:26.4387683Z test_ops.py::TestCommonCUDA::test_dtypes__refs_t_cuda PASSED [0.7387s] [ 3%] 2025-12-04T13:28:26.4387891Z 
test_ops.py::TestCommonCUDA::test_dtypes__refs_to_cuda PASSED [0.8048s] [ 3%] 2025-12-04T13:28:26.4388113Z test_ops.py::TestCommonCUDA::test_dtypes__refs_transpose_copy_cuda PASSED [0.7552s] [ 3%] 2025-12-04T13:28:26.4388346Z test_ops.py::TestCommonCUDA::test_dtypes__refs_tril_indices_cuda PASSED [0.7440s] [ 3%] 2025-12-04T13:28:26.4388576Z test_ops.py::TestCommonCUDA::test_dtypes__refs_triu_indices_cuda PASSED [0.7450s] [ 3%] 2025-12-04T13:28:26.4388807Z test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_copy_cuda PASSED [0.7552s] [ 3%] 2025-12-04T13:28:26.4389038Z test_ops.py::TestCommonCUDA::test_dtypes__refs_unsqueeze_cuda PASSED [0.7917s] [ 3%] 2025-12-04T13:28:26.4389257Z test_ops.py::TestCommonCUDA::test_dtypes__refs_vdot_cuda PASSED [0.7522s] [ 3%] 2025-12-04T13:28:26.4389485Z test_ops.py::TestCommonCUDA::test_dtypes__segment_reduce_lengths_cuda PASSED [1.0181s] [ 3%] 2025-12-04T13:28:26.4389712Z test_ops.py::TestCommonCUDA::test_dtypes_acos_cuda PASSED [1.0956s] [ 3%] 2025-12-04T13:28:26.4389920Z test_ops.py::TestCommonCUDA::test_dtypes_addmv_cuda PASSED [1.4137s] [ 3%] 2025-12-04T13:28:26.4390139Z test_ops.py::TestCommonCUDA::test_dtypes_allclose_cuda PASSED [0.8013s] [ 3%] 2025-12-04T13:28:26.4390346Z test_ops.py::TestCommonCUDA::test_dtypes_arange_cuda PASSED [0.0471s] [ 3%] 2025-12-04T13:28:26.4390553Z test_ops.py::TestCommonCUDA::test_dtypes_argwhere_cuda PASSED [0.8341s] [ 3%] 2025-12-04T13:28:26.4390770Z test_ops.py::TestCommonCUDA::test_dtypes_as_strided_scatter_cuda PASSED [0.7741s] [ 3%] 2025-12-04T13:28:26.4390989Z test_ops.py::TestCommonCUDA::test_dtypes_atan_cuda PASSED [0.7712s] [ 3%] 2025-12-04T13:28:26.4391197Z test_ops.py::TestCommonCUDA::test_dtypes_atleast_1d_cuda PASSED [0.7666s] [ 3%] 2025-12-04T13:28:26.4391405Z test_ops.py::TestCommonCUDA::test_dtypes_bincount_cuda PASSED [0.7992s] [ 3%] 2025-12-04T13:28:26.4391612Z test_ops.py::TestCommonCUDA::test_dtypes_bitwise_or_cuda PASSED [0.7735s] [ 3%] 2025-12-04T13:28:26.4391825Z 
test_ops.py::TestCommonCUDA::test_dtypes_bitwise_xor_cuda PASSED [0.7828s] [ 3%] 2025-12-04T13:28:26.4392062Z test_ops.py::TestCommonCUDA::test_dtypes_bool_cuda PASSED [0.7816s] [ 3%] 2025-12-04T13:28:26.4392268Z test_ops.py::TestCommonCUDA::test_dtypes_bucketize_cuda PASSED [0.7981s] [ 3%] 2025-12-04T13:28:26.4392489Z test_ops.py::TestCommonCUDA::test_dtypes_cdouble_cuda PASSED [0.7825s] [ 3%] 2025-12-04T13:28:26.4392694Z test_ops.py::TestCommonCUDA::test_dtypes_char_cuda PASSED [0.7781s] [ 3%] 2025-12-04T13:28:26.4392900Z test_ops.py::TestCommonCUDA::test_dtypes_clone_cuda PASSED [0.7674s] [ 3%] 2025-12-04T13:28:26.4393112Z test_ops.py::TestCommonCUDA::test_dtypes_column_stack_cuda PASSED [0.7733s] [ 3%] 2025-12-04T13:28:26.4393342Z test_ops.py::TestCommonCUDA::test_dtypes_contiguous_cuda PASSED [0.7881s] [ 3%] 2025-12-04T13:28:26.4393549Z test_ops.py::TestCommonCUDA::test_dtypes_cos_cuda PASSED [0.9819s] [ 3%] 2025-12-04T13:28:26.4393751Z test_ops.py::TestCommonCUDA::test_dtypes_cosh_cuda PASSED [1.0030s] [ 3%] 2025-12-04T13:28:26.4393955Z test_ops.py::TestCommonCUDA::test_dtypes_cross_cuda PASSED [0.7548s] [ 3%] 2025-12-04T13:28:26.4394159Z test_ops.py::TestCommonCUDA::test_dtypes_cummax_cuda PASSED [0.7635s] [ 3%] 2025-12-04T13:28:26.4394363Z test_ops.py::TestCommonCUDA::test_dtypes_cumsum_cuda PASSED [0.7532s] [ 3%] 2025-12-04T13:28:26.4394567Z test_ops.py::TestCommonCUDA::test_dtypes_diag_cuda PASSED [0.7688s] [ 3%] 2025-12-04T13:28:26.4394772Z test_ops.py::TestCommonCUDA::test_dtypes_diagonal_cuda PASSED [0.7532s] [ 3%] 2025-12-04T13:28:26.4394993Z test_ops.py::TestCommonCUDA::test_dtypes_dist_cuda PASSED [0.8384s] [ 3%] 2025-12-04T13:28:26.4395219Z test_ops.py::TestCommonCUDA::test_dtypes_div_trunc_rounding_cuda PASSED [0.7434s] [ 3%] 2025-12-04T13:28:26.4395445Z test_ops.py::TestCommonCUDA::test_dtypes_dot_cuda PASSED [0.7309s] [ 3%] 2025-12-04T13:28:26.4395663Z test_ops.py::TestCommonCUDA::test_dtypes_empty_like_cuda PASSED [0.7633s] [ 3%] 
2025-12-04T13:28:26.4395881Z test_ops.py::TestCommonCUDA::test_dtypes_expand_as_cuda PASSED [0.7423s] [ 3%] 2025-12-04T13:28:26.4396103Z test_ops.py::TestCommonCUDA::test_dtypes_exponential_cuda PASSED [0.7384s] [ 3%] 2025-12-04T13:28:26.4396326Z test_ops.py::TestCommonCUDA::test_dtypes_eye_cuda PASSED [0.8409s] [ 3%] 2025-12-04T13:28:26.4396541Z test_ops.py::TestCommonCUDA::test_dtypes_fft_ihfft_cuda PASSED [2.3637s] [ 3%] 2025-12-04T13:28:26.4396758Z test_ops.py::TestCommonCUDA::test_dtypes_fft_irfftn_cuda PASSED [7.0265s] [ 3%] 2025-12-04T13:28:26.4396975Z test_ops.py::TestCommonCUDA::test_dtypes_flipud_cuda PASSED [0.7024s] [ 3%] 2025-12-04T13:28:26.4397191Z test_ops.py::TestCommonCUDA::test_dtypes_fmax_cuda PASSED [0.7367s] [ 3%] 2025-12-04T13:28:26.4397405Z test_ops.py::TestCommonCUDA::test_dtypes_fmin_cuda PASSED [0.7339s] [ 3%] 2025-12-04T13:28:26.4397620Z test_ops.py::TestCommonCUDA::test_dtypes_index_add_cuda PASSED [0.7468s] [ 3%] 2025-12-04T13:28:26.4397836Z test_ops.py::TestCommonCUDA::test_dtypes_index_put_cuda PASSED [0.7076s] [ 3%] 2025-12-04T13:28:26.4398049Z test_ops.py::TestCommonCUDA::test_dtypes_inner_cuda PASSED [0.7203s] [ 4%] 2025-12-04T13:28:26.4398272Z test_ops.py::TestCommonCUDA::test_dtypes_isin_cuda PASSED [0.8272s] [ 4%] 2025-12-04T13:28:26.4398481Z test_ops.py::TestCommonCUDA::test_dtypes_istft_cuda PASSED [0.7499s] [ 4%] 2025-12-04T13:28:26.4398687Z test_ops.py::TestCommonCUDA::test_dtypes_item_cuda PASSED [0.7393s] [ 4%] 2025-12-04T13:28:26.4398894Z test_ops.py::TestCommonCUDA::test_dtypes_kron_cuda PASSED [0.7365s] [ 4%] 2025-12-04T13:28:26.4399099Z test_ops.py::TestCommonCUDA::test_dtypes_lcm_cuda PASSED [0.7396s] [ 4%] 2025-12-04T13:28:26.4399318Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_cholesky_ex_cuda PASSED [1.2323s] [ 4%] 2025-12-04T13:28:26.4399544Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_det_cuda PASSED [0.9532s] [ 4%] 2025-12-04T13:28:26.4399762Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_eigvalsh_cuda 
PASSED [0.8154s] [ 4%] 2025-12-04T13:28:26.4399989Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_cuda PASSED [1.3570s] [ 4%] 2025-12-04T13:28:26.4400223Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_factor_ex_cuda PASSED [0.8488s] [ 4%] 2025-12-04T13:28:26.4400454Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_lu_solve_cuda PASSED [1.1677s] [ 4%] 2025-12-04T13:28:26.4400720Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_norm_subgradients_at_zero_cuda PASSED [0.9223s] [ 4%] 2025-12-04T13:28:26.4400969Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_slogdet_cuda PASSED [0.8000s] [ 4%] 2025-12-04T13:28:26.4401192Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_cuda PASSED [0.8124s] [ 4%] 2025-12-04T13:28:26.4401431Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_solve_ex_cuda PASSED [0.8244s] [ 4%] 2025-12-04T13:28:26.4401660Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorinv_cuda PASSED [0.7661s] [ 4%] 2025-12-04T13:28:26.4401943Z test_ops.py::TestCommonCUDA::test_dtypes_linalg_tensorsolve_cuda PASSED [0.7854s] [ 4%] 2025-12-04T13:28:26.4402184Z test_ops.py::TestCommonCUDA::test_dtypes_log_softmax_with_dtype_cuda PASSED [0.7883s] [ 4%] 2025-12-04T13:28:26.4402419Z test_ops.py::TestCommonCUDA::test_dtypes_logcumsumexp_cuda PASSED [0.8093s] [ 4%] 2025-12-04T13:28:26.4402639Z test_ops.py::TestCommonCUDA::test_dtypes_logical_or_cuda PASSED [0.8059s] [ 4%] 2025-12-04T13:28:26.4402854Z test_ops.py::TestCommonCUDA::test_dtypes_logical_xor_cuda PASSED [0.7923s] [ 4%] 2025-12-04T13:28:26.4403096Z test_ops.py::TestCommonCUDA::test_dtypes_masked_amin_cuda PASSED [0.9330s] [ 4%] 2025-12-04T13:28:26.4403314Z test_ops.py::TestCommonCUDA::test_dtypes_masked_argmax_cuda PASSED [0.8511s] [ 4%] 2025-12-04T13:28:26.4403543Z test_ops.py::TestCommonCUDA::test_dtypes_masked_log_softmax_cuda PASSED [0.8555s] [ 4%] 2025-12-04T13:28:26.4403770Z test_ops.py::TestCommonCUDA::test_dtypes_masked_median_cuda PASSED [0.8351s] [ 4%] 
2025-12-04T13:28:26.4403996Z test_ops.py::TestCommonCUDA::test_dtypes_masked_normalize_cuda PASSED [0.8362s] [ 4%] 2025-12-04T13:28:26.4404216Z test_ops.py::TestCommonCUDA::test_dtypes_matmul_cuda PASSED [0.8083s] [ 4%] 2025-12-04T13:28:26.4404459Z test_ops.py::TestCommonCUDA::test_dtypes_max_pool2d_with_indices_backward_cuda PASSED [3.3210s] [ 4%] 2025-12-04T13:28:26.4404700Z test_ops.py::TestCommonCUDA::test_dtypes_mean_cuda PASSED [0.8067s] [ 4%] 2025-12-04T13:28:26.4404930Z test_ops.py::TestCommonCUDA::test_dtypes_meshgrid_list_of_tensors_cuda PASSED [0.7745s] [ 4%] 2025-12-04T13:28:26.4405164Z test_ops.py::TestCommonCUDA::test_dtypes_minimum_cuda PASSED [0.7831s] [ 4%] 2025-12-04T13:28:26.4405375Z test_ops.py::TestCommonCUDA::test_dtypes_mm_cuda PASSED [0.7689s] [ 4%] 2025-12-04T13:28:26.4405582Z test_ops.py::TestCommonCUDA::test_dtypes_movedim_cuda PASSED [0.7747s] [ 4%] 2025-12-04T13:28:26.4405788Z test_ops.py::TestCommonCUDA::test_dtypes_mul_cuda PASSED [1.1045s] [ 4%] 2025-12-04T13:28:26.4406000Z test_ops.py::TestCommonCUDA::test_dtypes_multinomial_cuda PASSED [0.8256s] [ 4%] 2025-12-04T13:28:26.4406235Z test_ops.py::TestCommonCUDA::test_dtypes_native_dropout_backward_cuda PASSED [0.7975s] [ 4%] 2025-12-04T13:28:26.4406511Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_adaptive_max_pool1d_cuda PASSED [0.7910s] [ 4%] 2025-12-04T13:28:26.4406772Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv1d_cuda PASSED [1.9176s] [ 4%] 2025-12-04T13:28:26.4407196Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_conv2d_cuda MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 2400, provided ptr: 0x7dadd2212600 size: 1024 2025-12-04T13:28:26.4407700Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7dadd2212600 size: 1024 2025-12-04T13:28:26.4408124Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace 
required: 2400, provided ptr: 0x7dadd2212800 size: 1024 2025-12-04T13:28:26.4408555Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7dadd2212800 size: 1024 2025-12-04T13:28:26.4408975Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7dadd2232200 size: 1024 2025-12-04T13:28:26.4409402Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7dadd2232200 size: 1024 2025-12-04T13:28:26.4409833Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7dadd2232400 size: 1024 2025-12-04T13:28:26.4410260Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7dadd2232400 size: 1024 2025-12-04T13:28:26.4410673Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7dadd2221800 size: 768 2025-12-04T13:28:26.4411077Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7dadd2221800 size: 768 2025-12-04T13:28:26.4411481Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7dadd2221800 size: 1024 2025-12-04T13:28:26.4411929Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7dadd2221800 size: 1024 2025-12-04T13:28:26.4412362Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7dadd2221a00 size: 1024 2025-12-04T13:28:26.4412787Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7dadd2221a00 size: 1024 2025-12-04T13:28:26.4413056Z PASSED [1.0870s] [ 4%] 2025-12-04T13:28:26.4413250Z 
test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_embedding_loss_cuda PASSED [0.8072s] [ 4%] 2025-12-04T13:28:26.4413528Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_cosine_similarity_cuda PASSED [0.7719s] [ 4%] 2025-12-04T13:28:26.4413790Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_dropout2d_cuda PASSED [0.8138s] [ 4%] 2025-12-04T13:28:26.4414056Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_fractional_max_pool3d_cuda PASSED [0.9237s] [ 4%] 2025-12-04T13:28:26.4414324Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_hardshrink_cuda PASSED [0.7865s] [ 4%] 2025-12-04T13:28:26.4414588Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_nearest_cuda PASSED [0.8278s] [ 4%] 2025-12-04T13:28:26.4414866Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_interpolate_trilinear_cuda PASSED [0.7862s] [ 4%] 2025-12-04T13:28:26.4415142Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_margin_ranking_loss_cuda PASSED [0.8220s] [ 4%] 2025-12-04T13:28:26.4415405Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool2d_cuda PASSED [3.3426s] [ 4%] 2025-12-04T13:28:26.4415669Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_max_pool3d_cuda PASSED [1.7534s] [ 4%] 2025-12-04T13:28:26.4415932Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pairwise_distance_cuda PASSED [0.7488s] [ 4%] 2025-12-04T13:28:26.4416196Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_pixel_shuffle_cuda PASSED [0.7306s] [ 4%] 2025-12-04T13:28:26.4416445Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_relu_cuda PASSED [0.7379s] [ 4%] 2025-12-04T13:28:26.4416696Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_soft_margin_loss_cuda PASSED [0.7351s] [ 4%] 2025-12-04T13:28:26.4416962Z test_ops.py::TestCommonCUDA::test_dtypes_nn_functional_upsample_bilinear_cuda PASSED [0.7794s] [ 4%] 2025-12-04T13:28:26.4417203Z test_ops.py::TestCommonCUDA::test_dtypes_nonzero_cuda PASSED [0.8184s] [ 4%] 
2025-12-04T13:28:26.4417415Z test_ops.py::TestCommonCUDA::test_dtypes_norm_fro_cuda PASSED [0.7832s] [ 4%] 2025-12-04T13:28:26.4417635Z test_ops.py::TestCommonCUDA::test_dtypes_normal_in_place_cuda PASSED [0.7709s] [ 4%] 2025-12-04T13:28:26.4417853Z test_ops.py::TestCommonCUDA::test_dtypes_ones_like_cuda PASSED [0.7920s] [ 4%] 2025-12-04T13:28:26.4418079Z test_ops.py::TestCommonCUDA::test_dtypes_outer_cuda PASSED [0.8002s] [ 4%] 2025-12-04T13:28:26.4418293Z test_ops.py::TestCommonCUDA::test_dtypes_permute_copy_cuda PASSED [0.7764s] [ 4%] 2025-12-04T13:28:26.4418547Z test_ops.py::TestCommonCUDA::test_dtypes_polygamma_polygamma_n_1_cuda SKIPPED [0.0002s] (Skipped!) [ 4%] 2025-12-04T13:28:26.4418808Z test_ops.py::TestCommonCUDA::test_dtypes_randint_cuda PASSED [0.8192s] [ 4%] 2025-12-04T13:28:26.4419021Z test_ops.py::TestCommonCUDA::test_dtypes_randn_like_cuda PASSED [0.8252s] [ 4%] 2025-12-04T13:28:26.4419235Z test_ops.py::TestCommonCUDA::test_dtypes_reciprocal_cuda PASSED [0.7806s] [ 4%] 2025-12-04T13:28:26.4419446Z test_ops.py::TestCommonCUDA::test_dtypes_repeat_cuda PASSED [0.8262s] [ 4%] 2025-12-04T13:28:26.4419654Z test_ops.py::TestCommonCUDA::test_dtypes_round_cuda PASSED [0.7695s] [ 4%] 2025-12-04T13:28:26.4419860Z test_ops.py::TestCommonCUDA::test_dtypes_rsqrt_cuda PASSED [0.7923s] [ 5%] 2025-12-04T13:28:26.4420085Z test_ops.py::TestCommonCUDA::test_dtypes_scatter_reduce_amin_cuda PASSED [0.8411s] [ 5%] 2025-12-04T13:28:26.4420316Z test_ops.py::TestCommonCUDA::test_dtypes_searchsorted_cuda PASSED [1.1939s] [ 5%] 2025-12-04T13:28:26.4420541Z test_ops.py::TestCommonCUDA::test_dtypes_sigmoid_cuda PASSED [1.3522s] [ 5%] 2025-12-04T13:28:26.4420751Z test_ops.py::TestCommonCUDA::test_dtypes_sign_cuda PASSED [0.7620s] [ 5%] 2025-12-04T13:28:26.4420982Z test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_bartlett_cuda PASSED [0.7823s] [ 5%] 2025-12-04T13:28:26.4421241Z test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_general_hamming_cuda PASSED [0.7915s] [ 
5%] 2025-12-04T13:28:26.4421499Z test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_hamming_cuda PASSED [0.7859s] [ 5%] 2025-12-04T13:28:26.4421746Z test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_kaiser_cuda PASSED [0.8007s] [ 5%] 2025-12-04T13:28:26.4422041Z test_ops.py::TestCommonCUDA::test_dtypes_signal_windows_nuttall_cuda PASSED [0.8071s] [ 5%] 2025-12-04T13:28:26.4422283Z test_ops.py::TestCommonCUDA::test_dtypes_softmax_with_dtype_cuda PASSED [0.8146s] [ 5%] 2025-12-04T13:28:26.4422523Z test_ops.py::TestCommonCUDA::test_dtypes_sparse_sampled_addmm_cuda PASSED [0.8172s] [ 5%] 2025-12-04T13:28:26.4422780Z test_ops.py::TestCommonCUDA::test_dtypes_special_chebyshev_polynomial_w_cuda PASSED [0.7338s] [ 5%] 2025-12-04T13:28:26.4423025Z test_ops.py::TestCommonCUDA::test_dtypes_special_entr_cuda PASSED [0.7133s] [ 5%] 2025-12-04T13:28:26.4423267Z test_ops.py::TestCommonCUDA::test_dtypes_special_modified_bessel_k1_cuda PASSED [0.7250s] [ 5%] 2025-12-04T13:28:26.4423510Z test_ops.py::TestCommonCUDA::test_dtypes_split_list_args_cuda PASSED [0.7196s] [ 5%] 2025-12-04T13:28:26.4423764Z test_ops.py::TestCommonCUDA::test_dtypes_split_with_sizes_cuda PASSED [0.7340s] [ 5%] 2025-12-04T13:28:26.4423985Z test_ops.py::TestCommonCUDA::test_dtypes_sqrt_cuda PASSED [0.9238s] [ 5%] 2025-12-04T13:28:26.4424203Z test_ops.py::TestCommonCUDA::test_dtypes_squeeze_copy_cuda PASSED [0.7285s] [ 5%] 2025-12-04T13:28:26.4424417Z test_ops.py::TestCommonCUDA::test_dtypes_squeeze_cuda PASSED [0.7452s] [ 5%] 2025-12-04T13:28:26.4424630Z test_ops.py::TestCommonCUDA::test_dtypes_sum_cuda PASSED [0.7494s] [ 5%] 2025-12-04T13:28:26.4424838Z test_ops.py::TestCommonCUDA::test_dtypes_tile_cuda PASSED [0.7998s] [ 5%] 2025-12-04T13:28:26.4425046Z test_ops.py::TestCommonCUDA::test_dtypes_topk_cuda PASSED [0.7334s] [ 5%] 2025-12-04T13:28:26.4425257Z test_ops.py::TestCommonCUDA::test_dtypes_transpose_cuda PASSED [0.7262s] [ 5%] 2025-12-04T13:28:26.4425476Z 
test_ops.py::TestCommonCUDA::test_dtypes_triangular_solve_cuda PASSED [0.7403s] [ 5%] 2025-12-04T13:28:26.4425697Z test_ops.py::TestCommonCUDA::test_dtypes_uniform_cuda PASSED [0.7369s] [ 5%] 2025-12-04T13:28:26.4425912Z test_ops.py::TestCommonCUDA::test_dtypes_unravel_index_cuda PASSED [0.7354s] [ 5%] 2025-12-04T13:28:26.4426149Z test_ops.py::TestCommonCUDA::test_dtypes_unsafe_split_cuda PASSED [0.7202s] [ 5%] 2025-12-04T13:28:26.4426363Z test_ops.py::TestCommonCUDA::test_dtypes_vsplit_cuda PASSED [0.7227s] [ 5%] 2025-12-04T13:28:26.4426572Z test_ops.py::TestCommonCUDA::test_dtypes_xlogy_cuda PASSED [0.7722s] [ 5%] 2025-12-04T13:28:26.4426783Z test_ops.py::TestCommonCUDA::test_dtypes_zeros_like_cuda PASSED [0.7497s] [ 5%] 2025-12-04T13:28:26.4427006Z test_ops.py::TestCommonCUDA::test_errors_T_cuda PASSED [0.0021s] [ 5%] 2025-12-04T13:28:26.4427216Z test_ops.py::TestCommonCUDA::test_errors___rdiv___cuda PASSED [0.7102s] [ 5%] 2025-12-04T13:28:26.4427424Z test_ops.py::TestCommonCUDA::test_errors_amin_cuda PASSED [0.7314s] [ 5%] 2025-12-04T13:28:26.4427634Z test_ops.py::TestCommonCUDA::test_errors_arange_cuda PASSED [0.0056s] [ 5%] 2025-12-04T13:28:26.4427845Z test_ops.py::TestCommonCUDA::test_errors_bernoulli_cuda PASSED [0.7064s] [ 5%] 2025-12-04T13:28:26.4428056Z test_ops.py::TestCommonCUDA::test_errors_clamp_max_cuda XFAIL [0.0034s] [ 5%] 2025-12-04T13:28:26.4428266Z test_ops.py::TestCommonCUDA::test_errors_complex_cuda PASSED [1.4009s] [ 5%] 2025-12-04T13:28:26.4428474Z test_ops.py::TestCommonCUDA::test_errors_copysign_cuda PASSED [0.6967s] [ 5%] 2025-12-04T13:28:26.4428696Z test_ops.py::TestCommonCUDA::test_errors_diag_cuda PASSED [0.6974s] [ 5%] 2025-12-04T13:28:26.4428915Z test_ops.py::TestCommonCUDA::test_errors_diagonal_copy_cuda PASSED [0.7050s] [ 5%] 2025-12-04T13:28:26.4429147Z test_ops.py::TestCommonCUDA::test_errors_div_trunc_rounding_cuda PASSED [0.0021s] [ 5%] 2025-12-04T13:28:26.4429369Z test_ops.py::TestCommonCUDA::test_errors_dsplit_cuda PASSED 
[0.7043s] [ 5%] 2025-12-04T13:28:26.4429577Z test_ops.py::TestCommonCUDA::test_errors_eq_cuda PASSED [0.6939s] [ 5%] 2025-12-04T13:28:26.4429787Z test_ops.py::TestCommonCUDA::test_errors_fft_fft2_cuda PASSED [0.7007s] [ 5%] 2025-12-04T13:28:26.4429995Z test_ops.py::TestCommonCUDA::test_errors_fft_hfft2_cuda PASSED [0.7015s] [ 5%] 2025-12-04T13:28:26.4430205Z test_ops.py::TestCommonCUDA::test_errors_fft_rfft2_cuda PASSED [0.6926s] [ 5%] 2025-12-04T13:28:26.4430413Z test_ops.py::TestCommonCUDA::test_errors_fft_rfft_cuda PASSED [0.7105s] [ 5%] 2025-12-04T13:28:26.4430628Z test_ops.py::TestCommonCUDA::test_errors_float_power_cuda PASSED [0.0028s] [ 5%] 2025-12-04T13:28:26.4430849Z test_ops.py::TestCommonCUDA::test_errors_floor_divide_cuda PASSED [0.0016s] [ 5%] 2025-12-04T13:28:26.4431061Z test_ops.py::TestCommonCUDA::test_errors_fmin_cuda PASSED [0.7041s] [ 5%] 2025-12-04T13:28:26.4431269Z test_ops.py::TestCommonCUDA::test_errors_gather_cuda PASSED [0.7021s] [ 5%] 2025-12-04T13:28:26.4431476Z test_ops.py::TestCommonCUDA::test_errors_gradient_cuda PASSED [0.7161s] [ 5%] 2025-12-04T13:28:26.4431684Z test_ops.py::TestCommonCUDA::test_errors_heaviside_cuda PASSED [0.6989s] [ 5%] 2025-12-04T13:28:26.4431936Z test_ops.py::TestCommonCUDA::test_errors_hypot_cuda PASSED [0.7005s] [ 5%] 2025-12-04T13:28:26.4432145Z test_ops.py::TestCommonCUDA::test_errors_igamma_cuda PASSED [0.6965s] [ 5%] 2025-12-04T13:28:26.4432352Z test_ops.py::TestCommonCUDA::test_errors_index_add_cuda PASSED [0.0031s] [ 5%] 2025-12-04T13:28:26.4432559Z test_ops.py::TestCommonCUDA::test_errors_item_cuda PASSED [0.7019s] [ 5%] 2025-12-04T13:28:26.4432791Z test_ops.py::TestCommonCUDA::test_errors_linalg_lstsq_grad_oriented_cuda PASSED [0.0032s] [ 5%] 2025-12-04T13:28:26.4433026Z test_ops.py::TestCommonCUDA::test_errors_linspace_cuda PASSED [0.0039s] [ 5%] 2025-12-04T13:28:26.4433240Z test_ops.py::TestCommonCUDA::test_errors_logical_xor_cuda PASSED [0.6986s] [ 5%] 2025-12-04T13:28:26.4433477Z 
test_ops.py::TestCommonCUDA::test_errors_logspace_tensor_overload_cuda PASSED [0.6988s] [ 5%] 2025-12-04T13:28:26.4433711Z test_ops.py::TestCommonCUDA::test_errors_max_binary_cuda PASSED [0.6943s] [ 5%] 2025-12-04T13:28:26.4433924Z test_ops.py::TestCommonCUDA::test_errors_maximum_cuda PASSED [0.6952s] [ 5%] 2025-12-04T13:28:26.4434152Z test_ops.py::TestCommonCUDA::test_errors_mul_cuda PASSED [0.0021s] [ 5%] 2025-12-04T13:28:26.4434365Z test_ops.py::TestCommonCUDA::test_errors_narrow_copy_cuda PASSED [0.7111s] [ 5%] 2025-12-04T13:28:26.4434592Z test_ops.py::TestCommonCUDA::test_errors_native_layer_norm_cuda PASSED [0.7142s] [ 5%] 2025-12-04T13:28:26.4434875Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_adaptive_max_pool1d_cuda PASSED [0.7010s] [ 6%] 2025-12-04T13:28:26.4435140Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_avg_pool3d_cuda PASSED [0.0058s] [ 6%] 2025-12-04T13:28:26.4435387Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_l1_loss_cuda PASSED [0.7001s] [ 6%] 2025-12-04T13:28:26.4435645Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_margin_ranking_loss_cuda PASSED [0.7035s] [ 6%] 2025-12-04T13:28:26.4435924Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_multilabel_margin_loss_cuda PASSED [0.7241s] [ 6%] 2025-12-04T13:28:26.4436184Z test_ops.py::TestCommonCUDA::test_errors_nn_functional_prelu_cuda PASSED [0.7092s] [ 6%] 2025-12-04T13:28:26.4436410Z test_ops.py::TestCommonCUDA::test_errors_pow_cuda PASSED [0.0028s] [ 6%] 2025-12-04T13:28:26.4436621Z test_ops.py::TestCommonCUDA::test_errors_remainder_cuda PASSED [0.0023s] [ 6%] 2025-12-04T13:28:26.4436842Z test_ops.py::TestCommonCUDA::test_errors_roll_cuda PASSED [0.7244s] [ 6%] 2025-12-04T13:28:26.4437050Z test_ops.py::TestCommonCUDA::test_errors_rot90_cuda PASSED [0.7109s] [ 6%] 2025-12-04T13:28:26.4437257Z test_ops.py::TestCommonCUDA::test_errors_scatter_cuda PASSED [0.7313s] [ 6%] 2025-12-04T13:28:26.4437495Z 
test_ops.py::TestCommonCUDA::test_errors_signal_windows_general_hamming_cuda PASSED [0.0051s] [ 6%] 2025-12-04T13:28:26.4437751Z test_ops.py::TestCommonCUDA::test_errors_signal_windows_nuttall_cuda PASSED [0.0042s] [ 6%] 2025-12-04T13:28:26.4437993Z test_ops.py::TestCommonCUDA::test_errors_sparse_mul_layout3_cuda PASSED [0.0028s] [ 6%] 2025-12-04T13:28:26.4438227Z test_ops.py::TestCommonCUDA::test_errors_sparse_sum_layout4_cuda PASSED [0.0010s] [ 6%] 2025-12-04T13:28:26.4438478Z test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_t_cuda PASSED [0.0022s] [ 6%] 2025-12-04T13:28:26.4438746Z test_ops.py::TestCommonCUDA::test_errors_special_chebyshev_polynomial_w_cuda PASSED [0.0020s] [ 6%] 2025-12-04T13:28:26.4438984Z test_ops.py::TestCommonCUDA::test_errors_trace_cuda PASSED [0.7161s] [ 6%] 2025-12-04T13:28:26.4439194Z test_ops.py::TestCommonCUDA::test_errors_tril_cuda PASSED [0.0032s] [ 6%] 2025-12-04T13:28:26.4439408Z test_ops.py::TestCommonCUDA::test_errors_true_divide_cuda PASSED [0.0016s] [ 6%] 2025-12-04T13:28:26.4439713Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch__batch_norm_with_update_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4440091Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_abs_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4440449Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_addcmul_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4440797Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_any_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4441157Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_as_strided_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4441515Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_asinh_cuda_float32 SKIPPED [0.0009s] 
(Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4442034Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan2_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4442382Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_atan_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4442726Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ceil_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4443090Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_cumprod_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4443439Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfc_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4443797Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_erfinv_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4444147Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_fft_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4444495Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_hfft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4444851Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ifft2_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4445209Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft2_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4445579Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4445936Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_ihfftn_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 
2025-12-04T13:28:26.4446289Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_fft_rfft2_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4446639Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_floor_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4446986Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_frexp_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4447329Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_gt_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4447670Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_hypot_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4448015Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_i0_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4448355Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_ldexp_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4448707Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cond_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4449079Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_cross_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4449438Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_det_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4449878Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_householder_product_cuda_float32 SKIPPED [0.0005s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 6%] 2025-12-04T13:28:26.4450329Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_ldl_factor_ex_cuda_float32 SKIPPED 
[0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4450706Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_lu_factor_ex_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4451080Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_matrix_rank_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4451458Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_pinv_hermitian_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4451889Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_linalg_slogdet_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4452248Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log1p_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4452601Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_log_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4452953Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logical_not_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4453333Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_logspace_tensor_overload_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4453705Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_solve_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4454054Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_lu_unpack_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4454424Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_masked_select_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4454779Z 
test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_matmul_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4455143Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_max_reduction_with_dim_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4455527Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_min_reduction_no_dim_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4455886Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mm_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4456224Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_msort_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4456561Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_mv_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 6%] 2025-12-04T13:28:26.4456909Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nanquantile_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4457267Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_narrow_copy_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4457641Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_nn_functional_avg_pool2d_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4458022Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_norm_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4458365Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4458725Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_normal_number_mean_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 
2025-12-04T13:28:26.4459090Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_rad2deg_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4459456Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_0_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4459817Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_round_decimals_3_cuda_float32 SKIPPED [0.0001s] (Skipped!) [ 7%] 2025-12-04T13:28:26.4460184Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_amax_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4460572Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_scatter_reduce_sum_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4460933Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sigmoid_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4461292Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_signbit_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4461635Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sort_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4462031Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_sparse_sampled_addmm_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4462413Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_bessel_j1_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4462804Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_u_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4463212Z 
test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_v_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4463634Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_chebyshev_polynomial_w_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4464017Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_i0e_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4464397Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_legendre_polynomial_p_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4464785Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_log_ndtr_cuda_float32 SKIPPED [0.0008s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4465152Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_ndtri_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4465550Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_special_shifted_chebyshev_polynomial_v_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4465937Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_topk_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4466277Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_tril_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4466628Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_true_divide_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4467002Z test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_unsqueeze_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4467356Z 
test_ops.py::TestCommonCUDA::test_meta_consistency_out_dtype_mismatch_vdot_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 7%] 2025-12-04T13:28:26.4467702Z test_ops.py::TestCommonCUDA::test_multiple_devices___getitem___cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4468041Z test_ops.py::TestCommonCUDA::test_multiple_devices___radd___cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4468372Z test_ops.py::TestCommonCUDA::test_multiple_devices___rdiv___cuda_int64 SKIPPED [0.0011s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4468704Z test_ops.py::TestCommonCUDA::test_multiple_devices__chunk_cat_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4469059Z test_ops.py::TestCommonCUDA::test_multiple_devices__unsafe_masked_index_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4469452Z test_ops.py::TestCommonCUDA::test_multiple_devices__upsample_bilinear2d_aa_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4469804Z test_ops.py::TestCommonCUDA::test_multiple_devices_abs_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4470145Z test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4470467Z test_ops.py::TestCommonCUDA::test_multiple_devices_all_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4470789Z test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4471114Z test_ops.py::TestCommonCUDA::test_multiple_devices_amax_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4471438Z test_ops.py::TestCommonCUDA::test_multiple_devices_amin_cuda_float32 SKIPPED [0.0009s] (fewer than 2 
devices detected) [ 7%] 2025-12-04T13:28:26.4471773Z test_ops.py::TestCommonCUDA::test_multiple_devices_aminmax_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4472152Z test_ops.py::TestCommonCUDA::test_multiple_devices_angle_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4472478Z test_ops.py::TestCommonCUDA::test_multiple_devices_argmin_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4472815Z test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_copy_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4473176Z test_ops.py::TestCommonCUDA::test_multiple_devices_as_strided_partial_views_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4473527Z test_ops.py::TestCommonCUDA::test_multiple_devices_asin_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4473853Z test_ops.py::TestCommonCUDA::test_multiple_devices_asinh_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4474177Z test_ops.py::TestCommonCUDA::test_multiple_devices_atan2_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4474501Z test_ops.py::TestCommonCUDA::test_multiple_devices_atan_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4474826Z test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4475152Z test_ops.py::TestCommonCUDA::test_multiple_devices_atanh_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4475502Z test_ops.py::TestCommonCUDA::test_multiple_devices_atleast_2d_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4475841Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_baddbmm_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4476182Z test_ops.py::TestCommonCUDA::test_multiple_devices_bernoulli_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4476525Z test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4476862Z test_ops.py::TestCommonCUDA::test_multiple_devices_bfloat16_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4477195Z test_ops.py::TestCommonCUDA::test_multiple_devices_bincount_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4477531Z test_ops.py::TestCommonCUDA::test_multiple_devices_bitwise_xor_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4477872Z test_ops.py::TestCommonCUDA::test_multiple_devices_block_diag_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4478221Z test_ops.py::TestCommonCUDA::test_multiple_devices_bool_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4478562Z test_ops.py::TestCommonCUDA::test_multiple_devices_broadcast_tensors_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4478937Z test_ops.py::TestCommonCUDA::test_multiple_devices_cartesian_prod_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4479286Z test_ops.py::TestCommonCUDA::test_multiple_devices_cdouble_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4479619Z test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4479947Z test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_cuda_int64 SKIPPED [0.0010s] (fewer than 2 
devices detected) [ 7%] 2025-12-04T13:28:26.4480284Z test_ops.py::TestCommonCUDA::test_multiple_devices_clamp_min_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 7%] 2025-12-04T13:28:26.4480634Z test_ops.py::TestCommonCUDA::test_multiple_devices_combinations_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4480996Z test_ops.py::TestCommonCUDA::test_multiple_devices_conj_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4481340Z test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4481694Z test_ops.py::TestCommonCUDA::test_multiple_devices_constant_pad_nd_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4482070Z test_ops.py::TestCommonCUDA::test_multiple_devices_corrcoef_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4482402Z test_ops.py::TestCommonCUDA::test_multiple_devices_cos_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4482731Z test_ops.py::TestCommonCUDA::test_multiple_devices_cross_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4483063Z test_ops.py::TestCommonCUDA::test_multiple_devices_cumsum_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4483420Z test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4483793Z test_ops.py::TestCommonCUDA::test_multiple_devices_cumulative_trapezoid_cuda_int64 SKIPPED [0.0011s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4484149Z test_ops.py::TestCommonCUDA::test_multiple_devices_diag_embed_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4484505Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_diagonal_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4484846Z test_ops.py::TestCommonCUDA::test_multiple_devices_digamma_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4485181Z test_ops.py::TestCommonCUDA::test_multiple_devices_dsplit_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4485525Z test_ops.py::TestCommonCUDA::test_multiple_devices_empty_permuted_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4485864Z test_ops.py::TestCommonCUDA::test_multiple_devices_eq_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4486191Z test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4486516Z test_ops.py::TestCommonCUDA::test_multiple_devices_erfc_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4486841Z test_ops.py::TestCommonCUDA::test_multiple_devices_exp2_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4487179Z test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4487504Z test_ops.py::TestCommonCUDA::test_multiple_devices_exp_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4487844Z test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4488177Z test_ops.py::TestCommonCUDA::test_multiple_devices_expand_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4488507Z test_ops.py::TestCommonCUDA::test_multiple_devices_expm1_cuda_float32 SKIPPED [0.0011s] (fewer than 2 devices detected) [ 8%] 
2025-12-04T13:28:26.4488840Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fft_cuda_int64 SKIPPED [0.0014s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4489180Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_fftshift_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4489524Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_hfftn_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4489878Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft2_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4490217Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifft_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4490554Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ifftn_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4490896Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft2_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4491240Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfft_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4491581Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_ihfftn_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4491959Z test_ops.py::TestCommonCUDA::test_multiple_devices_fft_rfft_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4492297Z test_ops.py::TestCommonCUDA::test_multiple_devices_flatten_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4492626Z test_ops.py::TestCommonCUDA::test_multiple_devices_flip_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4492951Z test_ops.py::TestCommonCUDA::test_multiple_devices_floor_cuda_int64 SKIPPED [0.0009s] 
(fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4493292Z test_ops.py::TestCommonCUDA::test_multiple_devices_fmax_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4493623Z test_ops.py::TestCommonCUDA::test_multiple_devices_fmin_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4493955Z test_ops.py::TestCommonCUDA::test_multiple_devices_frexp_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4494287Z test_ops.py::TestCommonCUDA::test_multiple_devices_full_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4494621Z test_ops.py::TestCommonCUDA::test_multiple_devices_full_like_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4494960Z test_ops.py::TestCommonCUDA::test_multiple_devices_gradient_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4495304Z test_ops.py::TestCommonCUDA::test_multiple_devices_hash_tensor_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4495642Z test_ops.py::TestCommonCUDA::test_multiple_devices_hstack_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4495989Z test_ops.py::TestCommonCUDA::test_multiple_devices_hypot_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4496327Z test_ops.py::TestCommonCUDA::test_multiple_devices_index_copy_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4496683Z test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4497025Z test_ops.py::TestCommonCUDA::test_multiple_devices_index_fill_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4497371Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_amax_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4497731Z test_ops.py::TestCommonCUDA::test_multiple_devices_index_reduce_prod_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4498081Z test_ops.py::TestCommonCUDA::test_multiple_devices_index_select_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4498420Z test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4498763Z test_ops.py::TestCommonCUDA::test_multiple_devices_isnan_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4499089Z test_ops.py::TestCommonCUDA::test_multiple_devices_item_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4499452Z test_ops.py::TestCommonCUDA::test_multiple_devices_jiterator_binary_return_by_ref_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4499813Z test_ops.py::TestCommonCUDA::test_multiple_devices_le_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4500139Z test_ops.py::TestCommonCUDA::test_multiple_devices_lgamma_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4500484Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvals_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4500843Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_eigvalsh_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4501201Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_inv_ex_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4501557Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_norm_cuda_float32 SKIPPED [0.0011s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4501970Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_matrix_rank_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4502347Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_norm_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4502703Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_slogdet_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4503056Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_solve_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4503415Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_tensorinv_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 8%] 2025-12-04T13:28:26.4503772Z test_ops.py::TestCommonCUDA::test_multiple_devices_linalg_vecdot_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4504120Z test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4504482Z test_ops.py::TestCommonCUDA::test_multiple_devices_linspace_tensor_overload_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4504850Z test_ops.py::TestCommonCUDA::test_multiple_devices_log_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4505186Z test_ops.py::TestCommonCUDA::test_multiple_devices_logaddexp_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4505541Z test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4505902Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4506284Z test_ops.py::TestCommonCUDA::test_multiple_devices_logspace_tensor_overload_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4506647Z test_ops.py::TestCommonCUDA::test_multiple_devices_logsumexp_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4506979Z test_ops.py::TestCommonCUDA::test_multiple_devices_lt_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4507306Z test_ops.py::TestCommonCUDA::test_multiple_devices_lu_solve_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4507656Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_amax_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4507998Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_fill_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4508349Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logaddexp_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4508705Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_logsumexp_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4509055Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_prod_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4509402Z test_ops.py::TestCommonCUDA::test_multiple_devices_masked_select_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4509756Z test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_no_dim_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4510121Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_max_reduction_with_dim_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4510473Z test_ops.py::TestCommonCUDA::test_multiple_devices_maximum_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4510801Z test_ops.py::TestCommonCUDA::test_multiple_devices_mean_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4511157Z test_ops.py::TestCommonCUDA::test_multiple_devices_min_reduction_no_dim_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4511520Z test_ops.py::TestCommonCUDA::test_multiple_devices_mvlgamma_mvlgamma_p_5_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4511914Z test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4512247Z test_ops.py::TestCommonCUDA::test_multiple_devices_nansum_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4512583Z test_ops.py::TestCommonCUDA::test_multiple_devices_narrow_copy_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4512914Z test_ops.py::TestCommonCUDA::test_multiple_devices_ne_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4513242Z test_ops.py::TestCommonCUDA::test_multiple_devices_new_zeros_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4513597Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_avg_pool3d_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4513997Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_conv_transpose2d_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4514416Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_fractional_max_pool3d_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4514802Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_gelu_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4515172Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_grid_sample_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4515568Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_bilinear_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4515972Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_interpolate_linear_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4516369Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_margin_ranking_loss_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4516768Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool1d_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4517145Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_max_pool2d_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4517521Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_mse_loss_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4517918Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_multilabel_soft_margin_loss_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4518317Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_reflect_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4518698Z 
test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4519099Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_pad_replicate_negative_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4519501Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_poisson_nll_loss_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4519891Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_relu_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4520261Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_smooth_l1_loss_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4520650Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_soft_margin_loss_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4521044Z test_ops.py::TestCommonCUDA::test_multiple_devices_nn_functional_triplet_margin_loss_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4521405Z test_ops.py::TestCommonCUDA::test_multiple_devices_norm_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4521747Z test_ops.py::TestCommonCUDA::test_multiple_devices_normal_in_place_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4522144Z test_ops.py::TestCommonCUDA::test_multiple_devices_normal_number_mean_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4522487Z test_ops.py::TestCommonCUDA::test_multiple_devices_ones_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4522827Z test_ops.py::TestCommonCUDA::test_multiple_devices_polar_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices 
detected) [ 9%] 2025-12-04T13:28:26.4523180Z test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_0_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4523543Z test_ops.py::TestCommonCUDA::test_multiple_devices_polygamma_polygamma_n_2_cuda_float32 SKIPPED [0.0001s] (Skipped!) [ 9%] 2025-12-04T13:28:26.4523870Z test_ops.py::TestCommonCUDA::test_multiple_devices_pow_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4524205Z test_ops.py::TestCommonCUDA::test_multiple_devices_quantile_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4524517Z test_ops.py::TestCommonCUDA::test_multiple_devices_randint_cuda_int64 SKIPPED [0.0001s] (Skipped!) [ 9%] 2025-12-04T13:28:26.4524823Z test_ops.py::TestCommonCUDA::test_multiple_devices_real_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4525165Z test_ops.py::TestCommonCUDA::test_multiple_devices_remainder_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4525512Z test_ops.py::TestCommonCUDA::test_multiple_devices_repeat_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4525848Z test_ops.py::TestCommonCUDA::test_multiple_devices_resolve_neg_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4526181Z test_ops.py::TestCommonCUDA::test_multiple_devices_roll_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4526507Z test_ops.py::TestCommonCUDA::test_multiple_devices_rsub_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4526847Z test_ops.py::TestCommonCUDA::test_multiple_devices_scatter_reduce_sum_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4527188Z test_ops.py::TestCommonCUDA::test_multiple_devices_short_cuda_int64 SKIPPED [0.0009s] 
(fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4527513Z test_ops.py::TestCommonCUDA::test_multiple_devices_sign_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 9%] 2025-12-04T13:28:26.4527863Z test_ops.py::TestCommonCUDA::test_multiple_devices_signal_windows_cosine_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4528223Z test_ops.py::TestCommonCUDA::test_multiple_devices_slice_scatter_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4528563Z test_ops.py::TestCommonCUDA::test_multiple_devices_sort_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4528918Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_airy_ai_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4529276Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_bessel_y1_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4529658Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_chebyshev_polynomial_w_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4530033Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_i1e_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4530404Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_legendre_polynomial_p_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4530781Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_log_ndtr_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4531148Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_i0_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4531545Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_modified_bessel_k0_cuda_float32 
SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4531958Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtr_cuda_float32 SKIPPED [0.0011s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4532321Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_ndtri_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4532675Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4533033Z test_ops.py::TestCommonCUDA::test_multiple_devices_special_xlog1py_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4533382Z test_ops.py::TestCommonCUDA::test_multiple_devices_squeeze_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4533717Z test_ops.py::TestCommonCUDA::test_multiple_devices_stack_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4534054Z test_ops.py::TestCommonCUDA::test_multiple_devices_sum_to_size_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4534406Z test_ops.py::TestCommonCUDA::test_multiple_devices_t_cuda_int64 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4534727Z test_ops.py::TestCommonCUDA::test_multiple_devices_take_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4535067Z test_ops.py::TestCommonCUDA::test_multiple_devices_tensor_split_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4535415Z test_ops.py::TestCommonCUDA::test_multiple_devices_tensordot_cuda_float32 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4535757Z test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 
2025-12-04T13:28:26.4536094Z test_ops.py::TestCommonCUDA::test_multiple_devices_to_sparse_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4536426Z test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4536758Z test_ops.py::TestCommonCUDA::test_multiple_devices_trapz_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4537084Z test_ops.py::TestCommonCUDA::test_multiple_devices_triu_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4537418Z test_ops.py::TestCommonCUDA::test_multiple_devices_triu_indices_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4537765Z test_ops.py::TestCommonCUDA::test_multiple_devices_trunc_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4538103Z test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_copy_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4538448Z test_ops.py::TestCommonCUDA::test_multiple_devices_unfold_cuda_float32 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4538794Z test_ops.py::TestCommonCUDA::test_multiple_devices_unsafe_split_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4539130Z test_ops.py::TestCommonCUDA::test_multiple_devices_view_cuda_int64 SKIPPED [0.0009s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4539461Z test_ops.py::TestCommonCUDA::test_multiple_devices_vstack_cuda_float32 SKIPPED [0.0010s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4539791Z test_ops.py::TestCommonCUDA::test_multiple_devices_xlogy_cuda_int64 SKIPPED [0.0008s] (fewer than 2 devices detected) [ 10%] 2025-12-04T13:28:26.4540083Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_T_cuda_bool PASSED [0.7025s] 
[ 10%] 2025-12-04T13:28:26.4540359Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values___rdiv___cuda_bool PASSED [0.0080s] [ 10%] 2025-12-04T13:28:26.4540627Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_argsort_cuda_bool PASSED [0.7676s] [ 10%] 2025-12-04T13:28:26.4540907Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_as_strided_copy_cuda_bool PASSED [0.7066s] [ 10%] 2025-12-04T13:28:26.4541194Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_atan2_cuda_bool PASSED [0.0076s] [ 10%] 2025-12-04T13:28:26.4541456Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_bool_cuda_bool PASSED [0.7171s] [ 10%] 2025-12-04T13:28:26.4541719Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cdouble_cuda_bool PASSED [0.0054s] [ 10%] 2025-12-04T13:28:26.4542024Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_cfloat_cuda_bool PASSED [0.7253s] [ 10%] 2025-12-04T13:28:26.4542287Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_chunk_cuda_bool PASSED [0.0045s] [ 10%] 2025-12-04T13:28:26.4542560Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_count_nonzero_cuda_bool PASSED [0.7061s] [ 10%] 2025-12-04T13:28:26.4542831Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diag_cuda_bool PASSED [0.0074s] [ 10%] 2025-12-04T13:28:26.4543111Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diagflat_cuda_bool PASSED [0.7208s] [ 10%] 2025-12-04T13:28:26.4543375Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_diff_cuda_bool PASSED [0.0263s] [ 10%] 2025-12-04T13:28:26.4543635Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_double_cuda_bool PASSED [0.0042s] [ 10%] 2025-12-04T13:28:26.4543895Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_erf_cuda_bool PASSED [0.7257s] [ 10%] 2025-12-04T13:28:26.4544154Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_expm1_cuda_bool PASSED [0.0037s] [ 10%] 2025-12-04T13:28:26.4544417Z 
test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_hfft_cuda_bool PASSED [0.8473s] [ 10%] 2025-12-04T13:28:26.4544686Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft2_cuda_bool PASSED [1.0198s] [ 10%] 2025-12-04T13:28:26.4544955Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifft_cuda_bool PASSED [0.7449s] [ 10%] 2025-12-04T13:28:26.4545222Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ifftn_cuda_bool PASSED [0.7066s] [ 10%] 2025-12-04T13:28:26.4545496Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_ihfftn_cuda_bool PASSED [0.5798s] [ 10%] 2025-12-04T13:28:26.4545769Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_fft_irfft2_cuda_bool PASSED [0.8658s] [ 10%] 2025-12-04T13:28:26.4546036Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_float_cuda_bool PASSED [0.6967s] [ 10%] 2025-12-04T13:28:26.4546309Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_hstack_cuda_bool PASSED [0.0041s] [ 10%] 2025-12-04T13:28:26.4546576Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_index_copy_cuda_bool PASSED [0.6881s] [ 10%] 2025-12-04T13:28:26.4546876Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_jiterator_2inputs_2outputs_cuda_bool PASSED [0.0094s] [ 10%] 2025-12-04T13:28:26.4547170Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_lgamma_cuda_bool PASSED [0.7035s] [ 10%] 2025-12-04T13:28:26.4547455Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_log_softmax_with_dtype_cuda_bool PASSED [0.0060s] [ 10%] 2025-12-04T13:28:26.4547745Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_and_cuda_bool PASSED [0.0053s] [ 10%] 2025-12-04T13:28:26.4548023Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logical_not_cuda_bool PASSED [0.6983s] [ 10%] 2025-12-04T13:28:26.4548297Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_logsumexp_cuda_bool PASSED [0.0059s] [ 10%] 
2025-12-04T13:28:26.4548560Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_long_cuda_bool PASSED [0.7103s] [ 10%] 2025-12-04T13:28:26.4548840Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_masked_prod_cuda_bool PASSED [0.0375s] [ 11%] 2025-12-04T13:28:26.4549117Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_narrow_copy_cuda_bool PASSED [0.7074s] [ 11%] 2025-12-04T13:28:26.4549387Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_full_cuda_bool PASSED [0.0059s] [ 11%] 2025-12-04T13:28:26.4549666Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_ones_cuda_bool PASSED [0.7073s] [ 11%] 2025-12-04T13:28:26.4549932Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_new_zeros_cuda_bool PASSED [0.0057s] [ 11%] 2025-12-04T13:28:26.4550237Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_2_cuda_bool SKIPPED [0.0002s] (Skipped!) [ 11%] 2025-12-04T13:28:26.4550580Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_polygamma_polygamma_n_4_cuda_bool SKIPPED [0.0001s] (Skipped!) [ 11%] 2025-12-04T13:28:26.4550884Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_ravel_cuda_bool PASSED [0.0032s] [ 11%] 2025-12-04T13:28:26.4551150Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resize__cuda_bool PASSED [0.6994s] [ 11%] 2025-12-04T13:28:26.4551434Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_resolve_conj_cuda_bool PASSED [0.0037s] [ 11%] 2025-12-04T13:28:26.4551724Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_scatter_cuda_bool SKIPPED [0.0002s] (Skipped!) 
[ 11%] 2025-12-04T13:28:26.4552060Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j0_cuda_bool PASSED [0.7095s] [ 11%] 2025-12-04T13:28:26.4552352Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_bessel_j1_cuda_bool PASSED [0.0059s] [ 11%] 2025-12-04T13:28:26.4552640Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_erfcx_cuda_bool PASSED [0.7082s] [ 11%] 2025-12-04T13:28:26.4552920Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1_cuda_bool PASSED [0.0057s] [ 11%] 2025-12-04T13:28:26.4553196Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_i1e_cuda_bool PASSED [0.7055s] [ 11%] 2025-12-04T13:28:26.4553493Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_modified_bessel_i0_cuda_bool PASSED [0.0057s] [ 11%] 2025-12-04T13:28:26.4553827Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_shifted_chebyshev_polynomial_w_cuda_bool PASSED [0.0083s] [ 11%] 2025-12-04T13:28:26.4554149Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_xlog1py_cuda_bool PASSED [0.0058s] [ 11%] 2025-12-04T13:28:26.4554432Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_special_zeta_cuda_bool PASSED [0.0077s] [ 11%] 2025-12-04T13:28:26.4554701Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sqrt_cuda_bool PASSED [0.7206s] [ 11%] 2025-12-04T13:28:26.4554972Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_sum_cuda_bool PASSED [0.0087s] [ 11%] 2025-12-04T13:28:26.4555229Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_t_cuda_bool PASSED [0.7191s] [ 11%] 2025-12-04T13:28:26.4555500Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_take_along_dim_cuda_bool PASSED [0.0059s] [ 11%] 2025-12-04T13:28:26.4555774Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_trace_cuda_bool PASSED [0.7138s] [ 11%] 2025-12-04T13:28:26.4556040Z 
test_ops.py::TestCommonCUDA::test_non_standard_bool_values_transpose_cuda_bool PASSED [0.0050s] [ 11%] 2025-12-04T13:28:26.4556313Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unfold_copy_cuda_bool PASSED [0.7064s] [ 11%] 2025-12-04T13:28:26.4556598Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unique_consecutive_cuda_bool PASSED [0.1624s] [ 11%] 2025-12-04T13:28:26.4556886Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_unsafe_chunk_cuda_bool PASSED [0.0036s] [ 11%] 2025-12-04T13:28:26.4557154Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_view_cuda_bool PASSED [0.7251s] [ 11%] 2025-12-04T13:28:26.4557431Z test_ops.py::TestCommonCUDA::test_non_standard_bool_values_zeros_cuda_bool PASSED [0.0035s] [ 11%] 2025-12-04T13:28:26.4557690Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_T_cuda_int64 PASSED [0.7193s] [ 11%] 2025-12-04T13:28:26.4557956Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples___rpow___cuda_float32 PASSED [0.0125s] [ 11%] 2025-12-04T13:28:26.4558258Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_float32 PASSED [0.7280s] [ 11%] 2025-12-04T13:28:26.4558562Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples__unsafe_masked_index_cuda_int64 PASSED [0.0083s] [ 11%] 2025-12-04T13:28:26.4558846Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_abs_cuda_float32 PASSED [0.7183s] [ 11%] 2025-12-04T13:28:26.4559110Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_float32 PASSED [0.0136s] [ 11%] 2025-12-04T13:28:26.4559374Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_acosh_cuda_int64 PASSED [0.7147s] [ 11%] 2025-12-04T13:28:26.4559639Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addbmm_cuda_float32 PASSED [0.0119s] [ 11%] 2025-12-04T13:28:26.4559927Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmm_decomposed_cuda_complex64 PASSED [0.7248s] [ 11%] 2025-12-04T13:28:26.4560226Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_addmv_cuda_complex64 PASSED [0.0110s] [ 11%] 2025-12-04T13:28:26.4560499Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_complex64 PASSED [0.0115s] [ 11%] 2025-12-04T13:28:26.4560774Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_allclose_cuda_float32 PASSED [0.7268s] [ 11%] 2025-12-04T13:28:26.4561043Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_amax_cuda_float32 PASSED [0.0169s] [ 11%] 2025-12-04T13:28:26.4561311Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_float32 PASSED [0.0054s] [ 11%] 2025-12-04T13:28:26.4561577Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_aminmax_cuda_int64 PASSED [0.0040s] [ 11%] 2025-12-04T13:28:26.4561842Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_angle_cuda_float32 PASSED [0.7309s] [ 11%] 2025-12-04T13:28:26.4562148Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_any_cuda_float32 PASSED [0.0077s] [ 11%] 2025-12-04T13:28:26.4562412Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argsort_cuda_int64 PASSED [0.0112s] [ 11%] 2025-12-04T13:28:26.4562679Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_argwhere_cuda_int64 PASSED [0.7185s] [ 11%] 2025-12-04T13:28:26.4562954Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_copy_cuda_int64 XFAIL [0.0042s] [ 11%] 2025-12-04T13:28:26.4563308Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_as_strided_scatter_cuda_complex64 SKIPPED [0.0001s] (Works for int64, fails for everything else) [ 11%] 2025-12-04T13:28:26.4563663Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_asinh_cuda_float32 PASSED [1.4440s] [ 11%] 2025-12-04T13:28:26.4563926Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atan2_cuda_float32 PASSED [0.0136s] [ 11%] 2025-12-04T13:28:26.4564193Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atanh_cuda_complex64 PASSED [0.9433s] [ 11%] 2025-12-04T13:28:26.4564468Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_float32 PASSED [0.0091s] [ 11%] 2025-12-04T13:28:26.4564743Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_atleast_1d_cuda_int64 PASSED [0.7692s] [ 11%] 2025-12-04T13:28:26.4565016Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bfloat16_cuda_complex64 PASSED [0.0077s] [ 11%] 2025-12-04T13:28:26.4565289Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bitwise_or_cuda_int64 PASSED [0.0048s] [ 11%] 2025-12-04T13:28:26.4565557Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_block_diag_cuda_int64 PASSED [0.7259s] [ 11%] 2025-12-04T13:28:26.4565837Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_complex64 PASSED [0.7340s] [ 11%] 2025-12-04T13:28:26.4566138Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_broadcast_to_cuda_float32 PASSED [0.7244s] [ 11%] 2025-12-04T13:28:26.4566414Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_bucketize_cuda_int64 PASSED [0.0100s] [ 11%] 2025-12-04T13:28:26.4566682Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_complex64 PASSED [0.7295s] [ 11%] 2025-12-04T13:28:26.4566966Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_byte_cuda_int64 PASSED [0.0048s] [ 11%] 2025-12-04T13:28:26.4567236Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cdouble_cuda_complex64 PASSED [0.7456s] [ 11%] 2025-12-04T13:28:26.4567505Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chalf_cuda_int64 PASSED [0.0056s] [ 11%] 2025-12-04T13:28:26.4567788Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_inverse_cuda_float32 PASSED [0.7446s] [ 12%] 2025-12-04T13:28:26.4568495Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_complex64 SKIPPED [0.0004s] (Test is disabled because an issue exists disabling it: https://github.com/pytorch/pytorch/issues/165294 for platform(s) rocm. 
If you're seeing this on your local machine and would like to enable this test, please make sure CI is not set and you are not using the flag --import-disabled-tests.) [ 12%] 2025-12-04T13:28:26.4569184Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cholesky_solve_cuda_float32 PASSED [0.7407s] [ 12%] 2025-12-04T13:28:26.4569463Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_chunk_cuda_float32 PASSED [0.7335s] [ 12%] 2025-12-04T13:28:26.4569729Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_clamp_cuda_int64 PASSED [0.0054s] [ 12%] 2025-12-04T13:28:26.4569989Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_conj_cuda_int64 PASSED [0.7284s] [ 12%] 2025-12-04T13:28:26.4570271Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_constant_pad_nd_cuda_complex64 PASSED [0.0326s] [ 12%] 2025-12-04T13:28:26.4570560Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_contiguous_cuda_float32 PASSED [0.7493s] [ 12%] 2025-12-04T13:28:26.4570835Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_corrcoef_cuda_int64 PASSED [0.0063s] [ 12%] 2025-12-04T13:28:26.4571104Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_complex64 PASSED [1.4739s] [ 12%] 2025-12-04T13:28:26.4571366Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cos_cuda_int64 PASSED [0.0045s] [ 12%] 2025-12-04T13:28:26.4571640Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_count_nonzero_cuda_complex64 PASSED [0.7253s] [ 12%] 2025-12-04T13:28:26.4571957Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_cross_cuda_complex64 PASSED [0.0065s] [ 12%] 2025-12-04T13:28:26.4572235Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_complex64 PASSED [0.7431s] [ 12%] 2025-12-04T13:28:26.4572530Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_diag_embed_cuda_float32 PASSED [0.0132s] [ 12%] 2025-12-04T13:28:26.4572803Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dist_cuda_float32 PASSED 
[0.0465s] [ 12%] 2025-12-04T13:28:26.4573065Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dot_cuda_float32 PASSED [0.0032s] [ 12%] 2025-12-04T13:28:26.4573327Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_double_cuda_int64 PASSED [0.7242s] [ 12%] 2025-12-04T13:28:26.4573592Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_dsplit_cuda_int64 PASSED [0.0041s] [ 12%] 2025-12-04T13:28:26.4581312Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_permuted_cuda_complex64 SKIPPED [0.0002s] (Skipped!) [ 12%] 2025-12-04T13:28:26.4581658Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_empty_strided_cuda_complex64 SKIPPED [0.0001s] (Skipped!) [ 12%] 2025-12-04T13:28:26.4582010Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_equal_cuda_int64 PASSED [0.0044s] [ 12%] 2025-12-04T13:28:26.4582268Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_erf_cuda_float32 PASSED [0.7207s] [ 12%] 2025-12-04T13:28:26.4582570Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expand_as_cuda_int64 PASSED [0.0043s] [ 12%] 2025-12-04T13:28:26.4582833Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_float32 PASSED [0.7306s] [ 12%] 2025-12-04T13:28:26.4583092Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_expm1_cuda_int64 PASSED [0.0036s] [ 12%] 2025-12-04T13:28:26.4583381Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_complex64 PASSED [0.7508s] [ 12%] 2025-12-04T13:28:26.4583660Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_fftshift_cuda_int64 PASSED [0.0045s] [ 12%] 2025-12-04T13:28:26.4583927Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ifftn_cuda_int64 PASSED [0.7586s] [ 12%] 2025-12-04T13:28:26.4584191Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfft2_cuda_int64 PASSED [0.7698s] [ 12%] 2025-12-04T13:28:26.4584455Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_ihfftn_cuda_float32 PASSED [0.7908s] [ 12%] 
2025-12-04T13:28:26.4584724Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_irfft2_cuda_int64 PASSED [0.7595s] [ 12%] 2025-12-04T13:28:26.4584989Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fft_rfft2_cuda_float32 PASSED [1.0381s] [ 12%] 2025-12-04T13:28:26.4585270Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_flatten_cuda_complex64 PASSED [0.7539s] [ 12%] 2025-12-04T13:28:26.4585535Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fliplr_cuda_int64 PASSED [0.0040s] [ 12%] 2025-12-04T13:28:26.4585796Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_float_cuda_complex64 PASSED [0.7665s] [ 12%] 2025-12-04T13:28:26.4586053Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_floor_cuda_int64 PASSED [0.0036s] [ 12%] 2025-12-04T13:28:26.4586312Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_float32 PASSED [0.0109s] [ 12%] 2025-12-04T13:28:26.4586567Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_fmod_cuda_int64 PASSED [0.0049s] [ 12%] 2025-12-04T13:28:26.4586825Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_frexp_cuda_float32 PASSED [0.7405s] [ 12%] 2025-12-04T13:28:26.4587085Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_full_cuda_complex64 PASSED [0.0049s] [ 12%] 2025-12-04T13:28:26.4587342Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_gcd_cuda_int64 PASSED [0.1252s] [ 12%] 2025-12-04T13:28:26.4587596Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ge_cuda_float32 PASSED [0.0047s] [ 12%] 2025-12-04T13:28:26.4587857Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_geometric_cuda_float32 PASSED [0.7697s] [ 12%] 2025-12-04T13:28:26.4588129Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hash_tensor_cuda_float32 PASSED [0.0076s] [ 12%] 2025-12-04T13:28:26.4588409Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_hsplit_cuda_int64 PASSED [0.7541s] [ 12%] 2025-12-04T13:28:26.4588666Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_i0_cuda_float32 PASSED [0.3125s] [ 12%] 2025-12-04T13:28:26.4588923Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_igamma_cuda_float32 PASSED [0.0055s] [ 12%] 2025-12-04T13:28:26.4589192Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_copy_cuda_complex64 PASSED [0.7486s] [ 12%] 2025-12-04T13:28:26.4589469Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_fill_cuda_complex64 PASSED [0.0103s] [ 12%] 2025-12-04T13:28:26.4589741Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_put_cuda_float32 PASSED [0.7499s] [ 12%] 2025-12-04T13:28:26.4590019Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_index_reduce_amax_cuda_float32 PASSED [0.0122s] [ 12%] 2025-12-04T13:28:26.4590293Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_complex64 PASSED [0.7551s] [ 12%] 2025-12-04T13:28:26.4590550Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_int_cuda_float32 PASSED [0.0048s] [ 12%] 2025-12-04T13:28:26.4590809Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_complex64 PASSED [0.7508s] [ 12%] 2025-12-04T13:28:26.4591086Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isclose_cuda_float32 PASSED [0.0112s] [ 12%] 2025-12-04T13:28:26.4591345Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isin_cuda_int64 PASSED [0.7646s] [ 12%] 2025-12-04T13:28:26.4591612Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isinf_cuda_float32 PASSED [0.0036s] [ 12%] 2025-12-04T13:28:26.4591920Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_isneginf_cuda_float32 PASSED [0.7617s] [ 12%] 2025-12-04T13:28:26.4592215Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_complex64 PASSED [0.1268s] [ 12%] 2025-12-04T13:28:26.4592536Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_2inputs_2outputs_cuda_float32 PASSED [0.1136s] [ 12%] 2025-12-04T13:28:26.4592860Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_4inputs_with_extra_args_cuda_float32 PASSED [0.1146s] [ 12%] 2025-12-04T13:28:26.4593175Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_complex64 PASSED [0.1358s] [ 12%] 2025-12-04T13:28:26.4593467Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_cuda_float32 PASSED [0.1162s] [ 12%] 2025-12-04T13:28:26.4593792Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_jiterator_binary_return_by_ref_cuda_complex64 PASSED [0.1251s] [ 12%] 2025-12-04T13:28:26.4594095Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_kron_cuda_complex64 PASSED [0.7510s] [ 12%] 2025-12-04T13:28:26.4594366Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lerp_cuda_complex64 PASSED [0.1790s] [ 12%] 2025-12-04T13:28:26.4594635Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lgamma_cuda_float32 PASSED [1.1112s] [ 13%] 2025-12-04T13:28:26.4594922Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cholesky_ex_cuda_float32 PASSED [0.0155s] [ 13%] 2025-12-04T13:28:26.4595220Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cond_cuda_complex64 PASSED [0.7760s] [ 13%] 2025-12-04T13:28:26.4595508Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_cross_cuda_complex64 PASSED [0.7602s] [ 13%] 2025-12-04T13:28:26.4595799Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_det_cuda_complex64 PASSED [0.0133s] [ 13%] 2025-12-04T13:28:26.4596083Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eig_cuda_float32 PASSED [0.0899s] [ 13%] 2025-12-04T13:28:26.4596367Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_eigvals_cuda_float32 PASSED [0.0682s] [ 13%] 2025-12-04T13:28:26.4596668Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_ldl_factor_ex_cuda_float32 PASSED [0.0120s] [ 13%] 2025-12-04T13:28:26.4596979Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_lstsq_cuda_complex64 PASSED [0.2094s] [ 13%] 2025-12-04T13:28:26.4597278Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_matrix_rank_cuda_complex64 PASSED [0.0474s] [ 13%] 2025-12-04T13:28:26.4597577Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_complex64 PASSED [1.2646s] [ 13%] 2025-12-04T13:28:26.4597858Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_cuda_float32 PASSED [1.2412s] [ 13%] 2025-12-04T13:28:26.4598156Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_complex64 PASSED [0.0230s] [ 13%] 2025-12-04T13:28:26.4598454Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_complex64 PASSED [0.0285s] [ 13%] 2025-12-04T13:28:26.4598736Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_cuda_float32 PASSED [0.0272s] [ 13%] 2025-12-04T13:28:26.4599025Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_ex_cuda_complex64 PASSED [0.0278s] [ 13%] 2025-12-04T13:28:26.4599329Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_solve_triangular_cuda_float32 PASSED [0.2625s] [ 13%] 2025-12-04T13:28:26.4599649Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_svdvals_cuda_float32 PASSED [0.0316s] [ 13%] 2025-12-04T13:28:26.4599939Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_tensorsolve_cuda_float32 PASSED [1.2060s] [ 13%] 2025-12-04T13:28:26.4600229Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vander_cuda_float32 PASSED [1.2275s] [ 13%] 2025-12-04T13:28:26.4600533Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_vector_norm_cuda_float32 PASSED [0.1253s] [ 13%] 2025-12-04T13:28:26.4600812Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_cuda_int64 PASSED [1.2502s] [ 13%] 2025-12-04T13:28:26.4601098Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_log_softmax_with_dtype_cuda_complex64 PASSED [1.2507s] [ 13%] 2025-12-04T13:28:26.4601396Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp_cuda_complex64 PASSED [0.3235s] [ 13%] 2025-12-04T13:28:26.4601671Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logaddexp_cuda_float32 PASSED [0.0228s] [ 13%] 2025-12-04T13:28:26.4601986Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logcumsumexp_cuda_float32 PASSED [1.2114s] [ 13%] 2025-12-04T13:28:26.4602281Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_and_cuda_float32 PASSED [0.0066s] [ 13%] 2025-12-04T13:28:26.4602557Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_not_cuda_int64 PASSED [1.2160s] [ 13%] 2025-12-04T13:28:26.4602835Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_or_cuda_complex64 PASSED [0.0065s] [ 13%] 2025-12-04T13:28:26.4603119Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_complex64 PASSED [0.0049s] [ 13%] 2025-12-04T13:28:26.4603395Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logical_xor_cuda_int64 PASSED [0.0046s] [ 13%] 2025-12-04T13:28:26.4603667Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_logsumexp_cuda_float32 PASSED [1.2237s] [ 13%] 2025-12-04T13:28:26.4603938Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_long_cuda_complex64 PASSED [0.0051s] [ 13%] 2025-12-04T13:28:26.4604200Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_cuda_float32 PASSED [0.0540s] [ 13%] 2025-12-04T13:28:26.4604466Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_solve_cuda_complex64 PASSED [0.0685s] [ 13%] 2025-12-04T13:28:26.4604742Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_lu_unpack_cuda_complex64 PASSED [0.0447s] [ 13%] 2025-12-04T13:28:26.4605010Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_float32 PASSED [1.2033s] [ 13%] 2025-12-04T13:28:26.4605264Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mH_cuda_int64 PASSED [1.2088s] [ 13%] 2025-12-04T13:28:26.4605524Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_amin_cuda_int64 PASSED [0.0275s] [ 13%] 2025-12-04T13:28:26.4605816Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_complex64 PASSED [0.0169s] [ 13%] 2025-12-04T13:28:26.4606098Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_cumsum_cuda_int64 PASSED [0.0081s] [ 13%] 2025-12-04T13:28:26.4606375Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_fill_cuda_complex64 PASSED [0.0101s] [ 13%] 2025-12-04T13:28:26.4606652Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_norm_cuda_float32 PASSED [0.3185s] [ 13%] 2025-12-04T13:28:26.4606936Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_normalize_cuda_float32 PASSED [0.0313s] [ 13%] 2025-12-04T13:28:26.4607224Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_select_cuda_complex64 PASSED [1.2451s] [ 13%] 2025-12-04T13:28:26.4607499Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_std_cuda_int64 PASSED [0.0545s] [ 13%] 2025-12-04T13:28:26.4607772Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_masked_sum_cuda_complex64 PASSED [0.0660s] [ 13%] 2025-12-04T13:28:26.4608042Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_matmul_cuda_float32 PASSED [0.0169s] [ 13%] 2025-12-04T13:28:26.4608323Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_float32 PASSED [0.0102s] [ 13%] 2025-12-04T13:28:26.4608590Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_max_binary_cuda_int64 PASSED [0.0046s] [ 13%] 2025-12-04T13:28:26.4608858Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_maximum_cuda_float32 PASSED [0.0100s] [ 13%] 2025-12-04T13:28:26.4609138Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mm_cuda_complex64 PASSED [1.1920s] [ 13%] 2025-12-04T13:28:26.4609396Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mode_cuda_int64 PASSED [1.3369s] [ 13%] 2025-12-04T13:28:26.4609662Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_multinomial_cuda_float32 PASSED [1.2123s] [ 13%] 2025-12-04T13:28:26.4609928Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_mv_cuda_float32 PASSED [0.0062s] [ 13%] 2025-12-04T13:28:26.4610195Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_copy_cuda_complex64 PASSED [1.2156s] [ 13%] 2025-12-04T13:28:26.4610467Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_narrow_cuda_float32 PASSED [0.0109s] [ 13%] 2025-12-04T13:28:26.4610757Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_native_batch_norm_cuda_float32 PASSED [0.0152s] [ 13%] 2025-12-04T13:28:26.4611031Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_neg_cuda_float32 PASSED [1.2116s] [ 13%] 2025-12-04T13:28:26.4611308Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_empty_cuda_int64 SKIPPED [0.0003s] (Skipped!) 
[ 13%] 2025-12-04T13:28:26.4611592Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_ones_cuda_complex64 PASSED [1.2119s] [ 13%] 2025-12-04T13:28:26.4611899Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_new_zeros_cuda_float32 PASSED [0.0068s] [ 13%] 2025-12-04T13:28:26.4612192Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_avg_pool3d_cuda_float32 PASSED [0.0157s] [ 13%] 2025-12-04T13:28:26.4612503Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_batch_norm_cuda_float32 PASSED [1.3250s] [ 13%] 2025-12-04T13:28:26.4612805Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv1d_cuda_float32 PASSED [1.2389s] [ 13%] 2025-12-04T13:28:26.4613102Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv2d_cuda_float32 PASSED [0.0416s] [ 13%] 2025-12-04T13:28:26.4613414Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_conv_transpose2d_cuda_float32 PASSED [0.1417s] [ 13%] 2025-12-04T13:28:26.4613751Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cosine_embedding_loss_cuda_float32 PASSED [1.2238s] [ 14%] 2025-12-04T13:28:26.4614080Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_cross_entropy_cuda_float32 PASSED [0.0416s] [ 14%] 2025-12-04T13:28:26.4614402Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_dropout_cuda_float32 PASSED [0.0187s] [ 14%] 2025-12-04T13:28:26.4614737Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [1.2179s] [ 14%] 2025-12-04T13:28:26.4615067Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_gelu_cuda_float32 PASSED [0.0264s] [ 14%] 2025-12-04T13:28:26.4615371Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_grid_sample_cuda_float32 PASSED [0.0344s] [ 14%] 2025-12-04T13:28:26.4615682Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardsigmoid_cuda_float32 PASSED [1.2076s] [ 14%] 2025-12-04T13:28:26.4615992Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_hardswish_cuda_float32 PASSED [1.2226s] [ 14%] 2025-12-04T13:28:26.4616310Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_interpolate_area_cuda_float32 PASSED [1.2145s] [ 14%] 2025-12-04T13:28:26.4616623Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_kl_div_cuda_float32 PASSED [1.2236s] [ 14%] 2025-12-04T13:28:26.4616954Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_margin_ranking_loss_cuda_float32 PASSED [0.0253s] [ 14%] 2025-12-04T13:28:26.4617284Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool2d_grad_cuda_float32 PASSED [0.0584s] [ 14%] 2025-12-04T13:28:26.4617625Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_max_unpool3d_grad_cuda_float32 PASSED [0.0242s] [ 14%] 2025-12-04T13:28:26.4617965Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_multilabel_soft_margin_loss_cuda_float32 PASSED [1.2040s] [ 14%] 2025-12-04T13:28:26.4618297Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_complex64 PASSED [0.1305s] [ 14%] 2025-12-04T13:28:26.4618606Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_normalize_cuda_float32 PASSED [1.1995s] [ 14%] 2025-12-04T13:28:26.4618929Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pad_replicate_negative_cuda_float32 PASSED [0.0173s] [ 14%] 2025-12-04T13:28:26.4619264Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_shuffle_cuda_complex64 PASSED [0.0054s] [ 14%] 2025-12-04T13:28:26.4619598Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_pixel_unshuffle_cuda_int64 PASSED [0.0033s] [ 14%] 2025-12-04T13:28:26.4619903Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu6_cuda_int64 PASSED [1.1956s] [ 14%] 2025-12-04T13:28:26.4620193Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_relu_cuda_int64 PASSED [0.0050s] [ 14%] 2025-12-04T13:28:26.4620485Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_selu_cuda_float32 PASSED [1.2293s] [ 14%] 2025-12-04T13:28:26.4620793Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_smooth_l1_loss_cuda_float32 PASSED [0.0338s] [ 14%] 2025-12-04T13:28:26.4621109Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_complex64 PASSED [1.2511s] [ 14%] 2025-12-04T13:28:26.4621421Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_softsign_cuda_float32 PASSED [0.0075s] [ 14%] 2025-12-04T13:28:26.4621742Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_triplet_margin_loss_cuda_float32 PASSED [1.2191s] [ 14%] 2025-12-04T13:28:26.4622264Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nn_functional_unfold_cuda_complex64 PASSED [0.1087s] [ 14%] 2025-12-04T13:28:26.4622554Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_cuda_complex64 PASSED [0.0096s] [ 14%] 2025-12-04T13:28:26.4622856Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_nonzero_static_cuda_int64 SKIPPED [0.0006s] (Only runs on cpu) [ 14%] 2025-12-04T13:28:26.4623168Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_norm_fro_cuda_float32 PASSED [0.0045s] [ 14%] 2025-12-04T13:28:26.4623486Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_in_place_cuda_complex64 SKIPPED [0.0001s] (Test expects tensor input) [ 14%] 2025-12-04T13:28:26.4623835Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_normal_number_mean_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 14%] 2025-12-04T13:28:26.4624128Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_cuda_float32 PASSED [1.2285s] [ 14%] 2025-12-04T13:28:26.4624393Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ones_like_cuda_int64 PASSED [0.0062s] [ 14%] 2025-12-04T13:28:26.4624666Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_copy_cuda_float32 PASSED [1.2279s] [ 14%] 2025-12-04T13:28:26.4624938Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_permute_cuda_int64 PASSED [0.0050s] [ 14%] 2025-12-04T13:28:26.4625202Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pinverse_cuda_float32 PASSED [0.0132s] [ 14%] 2025-12-04T13:28:26.4625503Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_polygamma_polygamma_n_3_cuda_int64 SKIPPED [0.0002s] (Skipped!) [ 14%] 2025-12-04T13:28:26.4625819Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_positive_cuda_float32 PASSED [1.1944s] [ 14%] 2025-12-04T13:28:26.4626083Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_pow_cuda_complex64 PASSED [0.2839s] [ 14%] 2025-12-04T13:28:26.4626345Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_float32 PASSED [1.2077s] [ 14%] 2025-12-04T13:28:26.4626621Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rad2deg_cuda_int64 PASSED [0.0043s] [ 14%] 2025-12-04T13:28:26.4626922Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randint_cuda_int64 SKIPPED [0.0002s] (Test expects tensor input) [ 14%] 2025-12-04T13:28:26.4627261Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_randn_cuda_complex64 SKIPPED [0.0001s] (Test expects tensor input) [ 14%] 2025-12-04T13:28:26.4627567Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_ravel_cuda_int64 PASSED [0.0033s] [ 14%] 2025-12-04T13:28:26.4627824Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_real_cuda_float32 PASSED [1.2438s] [ 14%] 2025-12-04T13:28:26.4628090Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_renorm_cuda_complex64 PASSED [0.0159s] [ 14%] 2025-12-04T13:28:26.4628388Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_repeat_interleave_cuda_complex64 PASSED [1.2324s] [ 14%] 2025-12-04T13:28:26.4628679Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resize_as__cuda_complex64 PASSED [0.0052s] [ 14%] 2025-12-04T13:28:26.4628958Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_resolve_neg_cuda_complex64 PASSED [1.1983s] [ 14%] 2025-12-04T13:28:26.4629232Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_cuda_int64 PASSED [0.0041s] [ 14%] 2025-12-04T13:28:26.4629505Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_round_decimals_0_cuda_float32 PASSED [1.2141s] [ 14%] 2025-12-04T13:28:26.4629783Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_rsqrt_cuda_complex64 PASSED [0.0100s] [ 14%] 2025-12-04T13:28:26.4630073Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_complex64 SKIPPED [0.0002s] (Skipped!) [ 14%] 2025-12-04T13:28:26.4630391Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scalar_tensor_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 14%] 2025-12-04T13:28:26.4630698Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_scatter_reduce_sum_cuda_float32 PASSED [0.0217s] [ 14%] 2025-12-04T13:28:26.4630983Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_searchsorted_cuda_int64 PASSED [0.0746s] [ 14%] 2025-12-04T13:28:26.4631262Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_select_scatter_cuda_int64 PASSED [1.2294s] [ 14%] 2025-12-04T13:28:26.4631531Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sgn_cuda_complex64 PASSED [0.0056s] [ 14%] 2025-12-04T13:28:26.4631809Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_short_cuda_float32 PASSED [1.2314s] [ 14%] 2025-12-04T13:28:26.4632107Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sign_cuda_int64 PASSED [0.0041s] [ 14%] 2025-12-04T13:28:26.4632414Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_signal_windows_general_hamming_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 14%] 2025-12-04T13:28:26.4632531Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_sinh_cuda_complex64 PASSED [1.6052s] [ 14%] 2025-12-04T13:28:26.4632646Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_slice_cuda_complex64 PASSED [1.2189s] [ 14%] 2025-12-04T13:28:26.4632763Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_cuda_float32 PASSED [1.2206s] [ 14%] 2025-12-04T13:28:26.4632896Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_softmax_with_dtype_cuda_complex64 PASSED [1.2407s] [ 14%] 2025-12-04T13:28:26.4633022Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_airy_ai_cuda_float32 PASSED [0.2134s] [ 14%] 2025-12-04T13:28:26.4633150Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_bessel_j1_cuda_float32 PASSED [1.3987s] [ 15%] 2025-12-04T13:28:26.4633311Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_t_cuda_int64 PASSED [0.0107s] [ 15%] 2025-12-04T13:28:26.4633460Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_float32 PASSED [0.0075s] [ 15%] 2025-12-04T13:28:26.4633617Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_chebyshev_polynomial_u_cuda_int64 PASSED [0.0068s] [ 15%] 2025-12-04T13:28:26.4633760Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_hermite_polynomial_he_cuda_int64 PASSED [0.0068s] [ 15%] 2025-12-04T13:28:26.4633879Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i0e_cuda_float32 PASSED [1.5464s] [ 15%] 2025-12-04T13:28:26.4633994Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_i1_cuda_int64 PASSED [1.4146s] [ 15%] 2025-12-04T13:28:26.4634130Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_modified_bessel_i1_cuda_int64 PASSED [1.4296s] [ 15%] 2025-12-04T13:28:26.4634290Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_polygamma_special_polygamma_n_0_cuda_float32 PASSED [1.7507s] [ 15%] 2025-12-04T13:28:26.4634437Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_scaled_modified_bessel_k0_cuda_int64 PASSED [1.4282s] [ 15%] 2025-12-04T13:28:26.4634611Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_t_cuda_float32 PASSED [0.0102s] [ 15%] 2025-12-04T13:28:26.4634766Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_shifted_chebyshev_polynomial_u_cuda_float32 PASSED [0.0073s] [ 15%] 2025-12-04T13:28:26.4634888Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_special_zeta_cuda_float32 PASSED [0.0068s] [ 15%] 2025-12-04T13:28:26.4635016Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_complex64 PASSED [1.2328s] [ 15%] 2025-12-04T13:28:26.4635138Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_list_args_cuda_int64 PASSED [0.0050s] [ 15%] 2025-12-04T13:28:26.4635267Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_copy_cuda_int64 PASSED [1.2233s] [ 15%] 2025-12-04T13:28:26.4635394Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_complex64 PASSED [0.0090s] [ 15%] 2025-12-04T13:28:26.4635520Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_split_with_sizes_cuda_int64 PASSED [1.2328s] [ 15%] 2025-12-04T13:28:26.4635636Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_complex64 PASSED [0.0068s] [ 15%] 2025-12-04T13:28:26.4635751Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_float32 PASSED [1.2354s] [ 15%] 2025-12-04T13:28:26.4635864Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_square_cuda_int64 PASSED [0.0048s] [ 15%] 2025-12-04T13:28:26.4635994Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_cuda_float32 PASSED [1.2259s] [ 15%] 2025-12-04T13:28:26.4636123Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_complex64 PASSED [0.0091s] [ 15%] 2025-12-04T13:28:26.4636248Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_squeeze_multiple_cuda_int64 PASSED [1.2411s] [ 15%] 2025-12-04T13:28:26.4636359Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_cuda_float32 PASSED [0.0150s] [ 15%] 2025-12-04T13:28:26.4636480Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_complex64 PASSED [0.0165s] [ 15%] 2025-12-04T13:28:26.4636595Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_std_mean_cuda_float32 PASSED [1.2526s] [ 15%] 2025-12-04T13:28:26.4636708Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_complex64 PASSED [1.3502s] [ 15%] 2025-12-04T13:28:26.4636819Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_stft_cuda_float32 PASSED [1.1908s] [ 15%] 2025-12-04T13:28:26.4636939Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_svd_lowrank_cuda_float32 PASSED [0.9896s] [ 15%] 2025-12-04T13:28:26.4637050Z 
test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_complex64 PASSED [0.7693s] [ 15%] 2025-12-04T13:28:26.4637165Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_t_cuda_int64 PASSED [0.0043s] [ 15%] 2025-12-04T13:28:26.4637295Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_along_dim_cuda_complex64 PASSED [0.7691s] [ 15%] 2025-12-04T13:28:26.4637415Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_take_cuda_float32 PASSED [0.0105s] [ 15%] 2025-12-04T13:28:26.4637526Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_float32 PASSED [0.7787s] [ 15%] 2025-12-04T13:28:26.4637632Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tan_cuda_int64 PASSED [0.0041s] [ 15%] 2025-12-04T13:28:26.4637758Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_tensor_split_cuda_complex64 PASSED [0.7829s] [ 15%] 2025-12-04T13:28:26.4637870Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_cuda_complex64 PASSED [0.0199s] [ 15%] 2025-12-04T13:28:26.4637990Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_to_sparse_cuda_float32 SKIPPED [0.0002s] [ 15%] 2025-12-04T13:28:26.4638244Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_torch_ops_aten__efficient_attention_forward_cuda_float32 SKIPPED [0.0007s] (Efficient attention on ROCM doesn't support custom_mask_type==2) [ 15%] 2025-12-04T13:28:26.4638374Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_complex64 PASSED [0.7707s] [ 15%] 2025-12-04T13:28:26.4638488Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_trace_cuda_float32 PASSED [0.0047s] [ 15%] 2025-12-04T13:28:26.4638604Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_transpose_cuda_float32 PASSED [0.7747s] [ 15%] 2025-12-04T13:28:26.4638733Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triangular_solve_cuda_float32 PASSED [0.0222s] [ 15%] 2025-12-04T13:28:26.4638843Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_triu_cuda_int64 
PASSED [0.7298s] [ 15%] 2025-12-04T13:28:26.4638967Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unbind_copy_cuda_complex64 PASSED [0.0119s] [ 15%] 2025-12-04T13:28:26.4639088Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unfold_copy_cuda_complex64 PASSED [0.0178s] [ 15%] 2025-12-04T13:28:26.4639209Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unravel_index_cuda_int64 PASSED [0.7181s] [ 15%] 2025-12-04T13:28:26.4639331Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_unsafe_split_cuda_complex64 PASSED [0.0060s] [ 15%] 2025-12-04T13:28:26.4639444Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_cuda_float32 PASSED [0.7258s] [ 15%] 2025-12-04T13:28:26.4639566Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_as_real_cuda_complex64 PASSED [0.7182s] [ 15%] 2025-12-04T13:28:26.4639682Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_float32 PASSED [0.0090s] [ 15%] 2025-12-04T13:28:26.4639806Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_copy_cuda_int64 PASSED [0.7212s] [ 15%] 2025-12-04T13:28:26.4639919Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_view_cuda_int64 PASSED [0.0051s] [ 15%] 2025-12-04T13:28:26.4640030Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_vsplit_cuda_int64 PASSED [0.7207s] [ 15%] 2025-12-04T13:28:26.4640141Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_cuda_int64 PASSED [0.0036s] [ 15%] 2025-12-04T13:28:26.4640259Z test_ops.py::TestCommonCUDA::test_noncontiguous_samples_zeros_like_cuda_float32 PASSED [0.7188s] [ 15%] 2025-12-04T13:28:26.4640361Z test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_float64 PASSED [0.0057s] [ 15%] 2025-12-04T13:28:26.4640456Z test_ops.py::TestCommonCUDA::test_numpy_ref_aminmax_cuda_int64 PASSED [0.0041s] [ 15%] 2025-12-04T13:28:26.4640555Z test_ops.py::TestCommonCUDA::test_numpy_ref_argwhere_cuda_float64 PASSED [0.7114s] [ 15%] 2025-12-04T13:28:26.4640667Z 
test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_tensors_cuda_float64 PASSED [0.0067s] [ 15%] 2025-12-04T13:28:26.4640785Z test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_complex128 PASSED [0.7053s] [ 15%] 2025-12-04T13:28:26.4640888Z test_ops.py::TestCommonCUDA::test_numpy_ref_broadcast_to_cuda_int64 PASSED [0.7061s] [ 15%] 2025-12-04T13:28:26.4640980Z test_ops.py::TestCommonCUDA::test_numpy_ref_cat_cuda_int64 PASSED [0.0065s] [ 15%] 2025-12-04T13:28:26.4641087Z test_ops.py::TestCommonCUDA::test_numpy_ref_clone_cuda_complex128 XFAIL [0.0043s] [ 15%] 2025-12-04T13:28:26.4641180Z test_ops.py::TestCommonCUDA::test_numpy_ref_diff_cuda_int64 PASSED [0.0189s] [ 15%] 2025-12-04T13:28:26.4641274Z test_ops.py::TestCommonCUDA::test_numpy_ref_equal_cuda_int64 PASSED [0.7146s] [ 15%] 2025-12-04T13:28:26.4641367Z test_ops.py::TestCommonCUDA::test_numpy_ref_item_cuda_float64 PASSED [0.0049s] [ 16%] 2025-12-04T13:28:26.4641469Z test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_cross_cuda_int64 PASSED [0.7350s] [ 16%] 2025-12-04T13:28:26.4641578Z test_ops.py::TestCommonCUDA::test_numpy_ref_linalg_tensorinv_cuda_float64 PASSED [0.0069s] [ 16%] 2025-12-04T13:28:26.4641697Z test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_complex128 PASSED [0.7217s] [ 16%] 2025-12-04T13:28:26.4641808Z test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_l1_loss_cuda_float64 PASSED [0.0054s] [ 16%] 2025-12-04T13:28:26.4641985Z test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pairwise_distance_cuda_float64 PASSED [0.7129s] [ 16%] 2025-12-04T13:28:26.4642096Z test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_pdist_cuda_float64 PASSED [0.0134s] [ 16%] 2025-12-04T13:28:26.4642219Z test_ops.py::TestCommonCUDA::test_numpy_ref_nn_functional_smooth_l1_loss_cuda_float64 PASSED [0.7162s] [ 16%] 2025-12-04T13:28:26.4642323Z test_ops.py::TestCommonCUDA::test_numpy_ref_permute_cuda_complex128 PASSED [0.0180s] [ 16%] 2025-12-04T13:28:26.4642424Z 
test_ops.py::TestCommonCUDA::test_numpy_ref_repeat_cuda_float64 PASSED [0.0108s] [ 16%] 2025-12-04T13:28:26.4642521Z test_ops.py::TestCommonCUDA::test_numpy_ref_roll_cuda_complex128 PASSED [0.0062s] [ 16%] 2025-12-04T13:28:26.4642647Z test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_cosine_cuda_float64 PASSED [0.0083s] [ 16%] 2025-12-04T13:28:26.4642777Z test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_general_hamming_cuda_float64 PASSED [0.0077s] [ 16%] 2025-12-04T13:28:26.4642892Z test_ops.py::TestCommonCUDA::test_numpy_ref_signal_windows_kaiser_cuda_float64 PASSED [0.0083s] [ 16%] 2025-12-04T13:28:26.4642995Z test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_copy_cuda_int64 PASSED [0.7289s] [ 16%] 2025-12-04T13:28:26.4643094Z test_ops.py::TestCommonCUDA::test_numpy_ref_squeeze_cuda_complex128 PASSED [0.0056s] [ 16%] 2025-12-04T13:28:26.4643202Z test_ops.py::TestCommonCUDA::test_numpy_ref_tensor_split_cuda_complex128 PASSED [0.7186s] [ 16%] 2025-12-04T13:28:26.4643314Z test_ops.py::TestCommonCUDA::test_numpy_ref_tril_indices_cuda_int64 PASSED [0.0048s] [ 16%] 2025-12-04T13:28:26.4643414Z test_ops.py::TestCommonCUDA::test_numpy_ref_unbind_copy_cuda_int64 PASSED [0.7153s] [ 16%] 2025-12-04T13:28:26.4643513Z test_ops.py::TestCommonCUDA::test_numpy_ref_view_copy_cuda_float64 PASSED [0.0055s] [ 16%] 2025-12-04T13:28:26.4643606Z test_ops.py::TestCommonCUDA::test_out___getitem___cuda_float32 PASSED [0.7171s] [ 16%] 2025-12-04T13:28:26.4643697Z test_ops.py::TestCommonCUDA::test_out___rmul___cuda_float32 PASSED [0.0031s] [ 16%] 2025-12-04T13:28:26.4643808Z test_ops.py::TestCommonCUDA::test_out__refs__conversions_bool_cuda_float32 PASSED [0.7073s] [ 16%] 2025-12-04T13:28:26.4643917Z test_ops.py::TestCommonCUDA::test_out__refs__conversions_cfloat_cuda_float32 PASSED [0.0029s] [ 16%] 2025-12-04T13:28:26.4644027Z test_ops.py::TestCommonCUDA::test_out__refs__conversions_chalf_cuda_float32 PASSED [0.7073s] [ 16%] 2025-12-04T13:28:26.4644135Z 
test_ops.py::TestCommonCUDA::test_out__refs__conversions_char_cuda_float32 PASSED [0.0029s] [ 16%] 2025-12-04T13:28:26.4644233Z test_ops.py::TestCommonCUDA::test_out__refs_addcdiv_cuda_float32 PASSED [0.0205s] [ 16%] 2025-12-04T13:28:26.4644339Z test_ops.py::TestCommonCUDA::test_out__refs_any_cuda_float32 PASSED [0.7731s] [ 16%] 2025-12-04T13:28:26.4644433Z test_ops.py::TestCommonCUDA::test_out__refs_arange_cuda_float32 PASSED [0.0294s] [ 16%] 2025-12-04T13:28:26.4644542Z test_ops.py::TestCommonCUDA::test_out__refs_bitwise_left_shift_cuda_int64 PASSED [0.0104s] [ 16%] 2025-12-04T13:28:26.4644653Z test_ops.py::TestCommonCUDA::test_out__refs_bitwise_xor_cuda_int64 PASSED [0.0100s] [ 16%] 2025-12-04T13:28:26.4644759Z test_ops.py::TestCommonCUDA::test_out__refs_broadcast_shapes_cuda_float32 PASSED [0.0019s] [ 16%] 2025-12-04T13:28:26.4644904Z test_ops.py::TestCommonCUDA::test_out__refs_cauchy_cuda_float32 SKIPPED [0.0001s] (Expected: cauchy is not comparable) [ 16%] 2025-12-04T13:28:26.4644997Z test_ops.py::TestCommonCUDA::test_out__refs_clamp_cuda_float32 PASSED [0.7490s] [ 16%] 2025-12-04T13:28:26.4645094Z test_ops.py::TestCommonCUDA::test_out__refs_clamp_max_cuda_float32 PASSED [0.0117s] [ 16%] 2025-12-04T13:28:26.4645198Z test_ops.py::TestCommonCUDA::test_out__refs_conj_physical_cuda_float32 PASSED [0.7209s] [ 16%] 2025-12-04T13:28:26.4645291Z test_ops.py::TestCommonCUDA::test_out__refs_diag_cuda_float32 PASSED [0.0157s] [ 16%] 2025-12-04T13:28:26.4645418Z test_ops.py::TestCommonCUDA::test_out__refs_div_trunc_rounding_cuda_float32 PASSED [0.0164s] [ 16%] 2025-12-04T13:28:26.4645508Z test_ops.py::TestCommonCUDA::test_out__refs_dot_cuda_float32 PASSED [0.7171s] [ 16%] 2025-12-04T13:28:26.4645667Z test_ops.py::TestCommonCUDA::test_out__refs_exponential_cuda_float32 SKIPPED [0.0002s] (Expected: exponential is not comparable) [ 16%] 2025-12-04T13:28:26.4645768Z test_ops.py::TestCommonCUDA::test_out__refs_fft_fftshift_cuda_float32 PASSED [0.7148s] [ 16%] 
2025-12-04T13:28:26.4645863Z test_ops.py::TestCommonCUDA::test_out__refs_fft_hfft2_cuda_float32 PASSED [0.0165s] [ 16%] 2025-12-04T13:28:26.4645958Z test_ops.py::TestCommonCUDA::test_out__refs_fft_ifft2_cuda_float32 PASSED [0.7155s] [ 16%] 2025-12-04T13:28:26.4646057Z test_ops.py::TestCommonCUDA::test_out__refs_fft_irfftn_cuda_float32 PASSED [0.0153s] [ 16%] 2025-12-04T13:28:26.4646154Z test_ops.py::TestCommonCUDA::test_out__refs_fft_rfft_cuda_float32 PASSED [0.0115s] [ 16%] 2025-12-04T13:28:26.4646247Z test_ops.py::TestCommonCUDA::test_out__refs_flatten_cuda_float32 PASSED [0.7300s] [ 16%] 2025-12-04T13:28:26.4646341Z test_ops.py::TestCommonCUDA::test_out__refs_fliplr_cuda_float32 PASSED [0.0031s] [ 16%] 2025-12-04T13:28:26.4646432Z test_ops.py::TestCommonCUDA::test_out__refs_fmod_cuda_float32 PASSED [0.0157s] [ 16%] 2025-12-04T13:28:26.4646587Z test_ops.py::TestCommonCUDA::test_out__refs_geometric_cuda_float32 SKIPPED [0.0001s] (Expected: geometric is not comparable) [ 16%] 2025-12-04T13:28:26.4646677Z test_ops.py::TestCommonCUDA::test_out__refs_gt_cuda_float32 PASSED [0.0103s] [ 16%] 2025-12-04T13:28:26.4646781Z test_ops.py::TestCommonCUDA::test_out__refs_hstack_cuda_float32 PASSED [0.0048s] [ 16%] 2025-12-04T13:28:26.4646874Z test_ops.py::TestCommonCUDA::test_out__refs_igammac_cuda_float32 PASSED [0.0132s] [ 16%] 2025-12-04T13:28:26.4646966Z test_ops.py::TestCommonCUDA::test_out__refs_lerp_cuda_float32 PASSED [0.0223s] [ 16%] 2025-12-04T13:28:26.4647066Z test_ops.py::TestCommonCUDA::test_out__refs_linalg_cross_cuda_float32 PASSED [0.7105s] [ 16%] 2025-12-04T13:28:26.4647176Z test_ops.py::TestCommonCUDA::test_out__refs_linalg_vector_norm_cuda_float32 PASSED [0.1632s] [ 16%] 2025-12-04T13:28:26.4647272Z test_ops.py::TestCommonCUDA::test_out__refs_linspace_cuda_float32 PASSED [0.0542s] [ 16%] 2025-12-04T13:28:26.4647424Z test_ops.py::TestCommonCUDA::test_out__refs_log_normal_cuda_float32 SKIPPED [0.0001s] (Expected: log_normal is not comparable) [ 16%] 
2025-12-04T13:28:26.4647536Z test_ops.py::TestCommonCUDA::test_out__refs_log_softmax_with_dtype_cuda_float32 PASSED [0.7280s] [ 16%] 2025-12-04T13:28:26.4647637Z test_ops.py::TestCommonCUDA::test_out__refs_logaddexp2_cuda_float32 PASSED [0.0069s] [ 16%] 2025-12-04T13:28:26.4647738Z test_ops.py::TestCommonCUDA::test_out__refs_masked_fill_cuda_float32 PASSED [0.7252s] [ 16%] 2025-12-04T13:28:26.4647865Z test_ops.py::TestCommonCUDA::test_out__refs_meshgrid_list_of_tensors_cuda_float32 PASSED [0.0031s] [ 16%] 2025-12-04T13:28:26.4647965Z test_ops.py::TestCommonCUDA::test_out__refs_narrow_copy_cuda_float32 PASSED [0.7268s] [ 16%] 2025-12-04T13:28:26.4648061Z test_ops.py::TestCommonCUDA::test_out__refs_new_ones_cuda_float32 PASSED [0.0030s] [ 16%] 2025-12-04T13:28:26.4648167Z test_ops.py::TestCommonCUDA::test_out__refs_new_zeros_cuda_float32 PASSED [0.7049s] [ 16%] 2025-12-04T13:28:26.4648294Z test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_pairwise_distance_cuda_float32 PASSED [0.0099s] [ 16%] 2025-12-04T13:28:26.4648408Z test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softplus_cuda_float32 PASSED [0.7211s] [ 16%] 2025-12-04T13:28:26.4648526Z test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_softshrink_cuda_float32 PASSED [0.0088s] [ 16%] 2025-12-04T13:28:26.4648659Z test_ops.py::TestCommonCUDA::test_out__refs_nn_functional_triplet_margin_loss_cuda_float32 PASSED [0.7092s] [ 17%] 2025-12-04T13:28:26.4648752Z test_ops.py::TestCommonCUDA::test_out__refs_prod_cuda_float32 XFAIL [0.0052s] [ 17%] 2025-12-04T13:28:26.4648846Z test_ops.py::TestCommonCUDA::test_out__refs_real_cuda_float32 PASSED [1.4822s] [ 17%] 2025-12-04T13:28:26.4648950Z test_ops.py::TestCommonCUDA::test_out__refs_reshape_cuda_float32 PASSED [0.0032s] [ 17%] 2025-12-04T13:28:26.4649060Z test_ops.py::TestCommonCUDA::test_out__refs_softmax_with_dtype_cuda_float32 PASSED [0.0214s] [ 17%] 2025-12-04T13:28:26.4649163Z 
test_ops.py::TestCommonCUDA::test_out__refs_special_erfcx_cuda_float32 PASSED [0.7158s] [ 17%] 2025-12-04T13:28:26.4649262Z test_ops.py::TestCommonCUDA::test_out__refs_special_i1e_cuda_float32 PASSED [0.0058s] [ 17%] 2025-12-04T13:28:26.4649393Z test_ops.py::TestCommonCUDA::test_out__refs_special_multigammaln_mvlgamma_p_1_cuda_float32 PASSED [0.0134s] [ 17%] 2025-12-04T13:28:26.4649495Z test_ops.py::TestCommonCUDA::test_out__refs_special_ndtri_cuda_float32 PASSED [0.7118s] [ 17%] 2025-12-04T13:28:26.4649600Z test_ops.py::TestCommonCUDA::test_out__refs_special_xlog1py_cuda_float32 PASSED [0.0145s] [ 17%] 2025-12-04T13:28:26.4649703Z test_ops.py::TestCommonCUDA::test_out__refs_squeeze_copy_cuda_float32 PASSED [0.7235s] [ 17%] 2025-12-04T13:28:26.4649800Z test_ops.py::TestCommonCUDA::test_out__refs_stack_cuda_float32 PASSED [0.0113s] [ 17%] 2025-12-04T13:28:26.4649892Z test_ops.py::TestCommonCUDA::test_out__refs_std_cuda_float32 PASSED [0.7300s] [ 17%] 2025-12-04T13:28:26.4649986Z test_ops.py::TestCommonCUDA::test_out__refs_triu_cuda_float32 PASSED [0.0125s] [ 17%] 2025-12-04T13:28:26.4650086Z test_ops.py::TestCommonCUDA::test_out__refs_unfold_copy_cuda_float32 PASSED [0.7361s] [ 17%] 2025-12-04T13:28:26.4650192Z test_ops.py::TestCommonCUDA::test_out__refs_view_as_complex_cuda_float32 PASSED [0.0031s] [ 17%] 2025-12-04T13:28:26.4650298Z test_ops.py::TestCommonCUDA::test_out__refs_view_as_cuda_float32 PASSED [0.7275s] [ 17%] 2025-12-04T13:28:26.4650395Z test_ops.py::TestCommonCUDA::test_out__refs_vstack_cuda_float32 PASSED [0.0078s] [ 17%] 2025-12-04T13:28:26.4650487Z test_ops.py::TestCommonCUDA::test_out__refs_where_cuda_float32 PASSED [0.7377s] [ 17%] 2025-12-04T13:28:26.4650600Z test_ops.py::TestCommonCUDA::test_out__segment_reduce_lengths_cuda_float32 PASSED [0.0033s] [ 17%] 2025-12-04T13:28:26.4650705Z test_ops.py::TestCommonCUDA::test_out__unsafe_masked_index_cuda_float32 PASSED [0.7254s] [ 17%] 2025-12-04T13:28:26.4650797Z 
test_ops.py::TestCommonCUDA::test_out_addcdiv_cuda_float32 PASSED [0.0145s] [ 17%] 2025-12-04T13:28:26.4650884Z test_ops.py::TestCommonCUDA::test_out_angle_cuda_float32 PASSED [0.7312s] [ 17%] 2025-12-04T13:28:26.4650984Z test_ops.py::TestCommonCUDA::test_out_as_strided_copy_cuda_float32 PASSED [0.0080s] [ 17%] 2025-12-04T13:28:26.4651095Z test_ops.py::TestCommonCUDA::test_out_as_strided_partial_views_cuda_float32 PASSED [0.7072s] [ 17%] 2025-12-04T13:28:26.4651198Z test_ops.py::TestCommonCUDA::test_out_as_strided_scatter_cuda_float32 PASSED [0.0031s] [ 17%] 2025-12-04T13:28:26.4651295Z test_ops.py::TestCommonCUDA::test_out_asin_cuda_float32 PASSED [0.7121s] [ 17%] 2025-12-04T13:28:26.4651388Z test_ops.py::TestCommonCUDA::test_out_atleast_1d_cuda_float32 PASSED [0.0027s] [ 17%] 2025-12-04T13:28:26.4651477Z test_ops.py::TestCommonCUDA::test_out_bincount_cuda_int64 PASSED [0.7103s] [ 17%] 2025-12-04T13:28:26.4651580Z test_ops.py::TestCommonCUDA::test_out_bitwise_and_cuda_int64 PASSED [0.0081s] [ 17%] 2025-12-04T13:28:26.4651681Z test_ops.py::TestCommonCUDA::test_out_broadcast_shapes_cuda_float32 PASSED [0.0020s] [ 17%] 2025-12-04T13:28:26.4651767Z test_ops.py::TestCommonCUDA::test_out_byte_cuda_float32 PASSED [0.7095s] [ 17%] 2025-12-04T13:28:26.4651899Z test_ops.py::TestCommonCUDA::test_out_cholesky_inverse_cuda_float32 XFAIL [0.0078s] [ 17%] 2025-12-04T13:28:26.4651996Z test_ops.py::TestCommonCUDA::test_out_constant_pad_nd_cuda_float32 PASSED [1.4693s] [ 17%] 2025-12-04T13:28:26.4652089Z test_ops.py::TestCommonCUDA::test_out_copysign_cuda_float32 PASSED [0.0235s] [ 17%] 2025-12-04T13:28:26.4652175Z test_ops.py::TestCommonCUDA::test_out_cos_cuda_float32 PASSED [0.7391s] [ 17%] 2025-12-04T13:28:26.4652265Z test_ops.py::TestCommonCUDA::test_out_cumprod_cuda_float32 XFAIL [0.0047s] [ 17%] 2025-12-04T13:28:26.4652372Z test_ops.py::TestCommonCUDA::test_out_deg2rad_cuda_float32 PASSED [1.4405s] [ 17%] 2025-12-04T13:28:26.4652457Z 
test_ops.py::TestCommonCUDA::test_out_diag_cuda_float32 PASSED [0.0164s] [ 17%] 2025-12-04T13:28:26.4652547Z test_ops.py::TestCommonCUDA::test_out_diagflat_cuda_float32 PASSED [0.7072s] [ 17%] 2025-12-04T13:28:26.4652652Z test_ops.py::TestCommonCUDA::test_out_div_floor_rounding_cuda_float32 PASSED [0.0118s] [ 17%] 2025-12-04T13:28:26.4652791Z test_ops.py::TestCommonCUDA::test_out_empty_cuda_float32 SKIPPED [0.0002s] (Expected: empty is not comparable) [ 17%] 2025-12-04T13:28:26.4652883Z test_ops.py::TestCommonCUDA::test_out_expand_as_cuda_float32 PASSED [0.7048s] [ 17%] 2025-12-04T13:28:26.4652970Z test_ops.py::TestCommonCUDA::test_out_expand_cuda_float32 PASSED [0.0029s] [ 17%] 2025-12-04T13:28:26.4653059Z test_ops.py::TestCommonCUDA::test_out_expm1_cuda_float32 PASSED [0.7058s] [ 17%] 2025-12-04T13:28:26.4653153Z test_ops.py::TestCommonCUDA::test_out_exponential_cuda_float32 PASSED [0.0034s] [ 17%] 2025-12-04T13:28:26.4653244Z test_ops.py::TestCommonCUDA::test_out_fft_fftn_cuda_float32 PASSED [0.7399s] [ 17%] 2025-12-04T13:28:26.4653333Z test_ops.py::TestCommonCUDA::test_out_fft_hfft2_cuda_float32 PASSED [0.0112s] [ 17%] 2025-12-04T13:28:26.4653424Z test_ops.py::TestCommonCUDA::test_out_fft_ifft_cuda_float32 PASSED [0.7155s] [ 17%] 2025-12-04T13:28:26.4653511Z test_ops.py::TestCommonCUDA::test_out_fft_ihfft2_cuda_float32 XFAIL [0.0089s] [ 17%] 2025-12-04T13:28:26.4653601Z test_ops.py::TestCommonCUDA::test_out_fft_rfftn_cuda_float32 PASSED [1.4501s] [ 17%] 2025-12-04T13:28:26.4653701Z test_ops.py::TestCommonCUDA::test_out_float_cuda_float32 PASSED [0.0030s] [ 17%] 2025-12-04T13:28:26.4653788Z test_ops.py::TestCommonCUDA::test_out_floor_cuda_float32 PASSED [0.7071s] [ 17%] 2025-12-04T13:28:26.4653876Z test_ops.py::TestCommonCUDA::test_out_frac_cuda_float32 PASSED [0.0043s] [ 17%] 2025-12-04T13:28:26.4653959Z test_ops.py::TestCommonCUDA::test_out_gcd_cuda_int64 PASSED [0.0070s] [ 17%] 2025-12-04T13:28:26.4654049Z 
test_ops.py::TestCommonCUDA::test_out_gradient_cuda_float32 PASSED [0.7072s] [ 17%] 2025-12-04T13:28:26.4654134Z test_ops.py::TestCommonCUDA::test_out_gt_cuda_float32 PASSED [0.0082s] [ 17%] 2025-12-04T13:28:26.4654220Z test_ops.py::TestCommonCUDA::test_out_hypot_cuda_float32 PASSED [0.0093s] [ 17%] 2025-12-04T13:28:26.4654307Z test_ops.py::TestCommonCUDA::test_out_imag_cuda_complex64 PASSED [0.7166s] [ 17%] 2025-12-04T13:28:26.4654399Z test_ops.py::TestCommonCUDA::test_out_index_put_cuda_float32 PASSED [0.0031s] [ 17%] 2025-12-04T13:28:26.4654501Z test_ops.py::TestCommonCUDA::test_out_index_reduce_prod_cuda_float32 PASSED [0.7250s] [ 17%] 2025-12-04T13:28:26.4654586Z test_ops.py::TestCommonCUDA::test_out_isin_cuda_float32 PASSED [0.0056s] [ 17%] 2025-12-04T13:28:26.4654688Z test_ops.py::TestCommonCUDA::test_out_isreal_cuda_float32 PASSED [0.7102s] [ 17%] 2025-12-04T13:28:26.4654781Z test_ops.py::TestCommonCUDA::test_out_linalg_eig_cuda_float32 PASSED [0.3170s] [ 17%] 2025-12-04T13:28:26.4654879Z test_ops.py::TestCommonCUDA::test_out_linalg_eigvals_cuda_float32 PASSED [0.0159s] [ 17%] 2025-12-04T13:28:26.4654991Z test_ops.py::TestCommonCUDA::test_out_linalg_lu_factor_cuda_float32 PASSED [0.0634s] [ 17%] 2025-12-04T13:28:26.4655089Z test_ops.py::TestCommonCUDA::test_out_linalg_lu_solve_cuda_float32 PASSED [0.0759s] [ 17%] 2025-12-04T13:28:26.4655184Z test_ops.py::TestCommonCUDA::test_out_linalg_solve_cuda_float32 PASSED [0.0268s] [ 18%] 2025-12-04T13:28:26.4655295Z test_ops.py::TestCommonCUDA::test_out_linalg_solve_triangular_cuda_float32 PASSED [0.0968s] [ 18%] 2025-12-04T13:28:26.4655398Z test_ops.py::TestCommonCUDA::test_out_linalg_tensorinv_cuda_float32 PASSED [0.7116s] [ 18%] 2025-12-04T13:28:26.4655488Z test_ops.py::TestCommonCUDA::test_out_log_normal_cuda_float32 PASSED [0.0098s] [ 18%] 2025-12-04T13:28:26.4655585Z test_ops.py::TestCommonCUDA::test_out_logcumsumexp_cuda_float32 PASSED [0.7122s] [ 18%] 2025-12-04T13:28:26.4655690Z 
test_ops.py::TestCommonCUDA::test_out_logical_xor_cuda_float32 PASSED [0.0109s] [ 18%] 2025-12-04T13:28:26.4655778Z test_ops.py::TestCommonCUDA::test_out_lu_solve_cuda_float32 PASSED [0.0290s] [ 18%] 2025-12-04T13:28:26.4655870Z test_ops.py::TestCommonCUDA::test_out_lu_unpack_cuda_float32 PASSED [0.0750s] [ 18%] 2025-12-04T13:28:26.4655965Z test_ops.py::TestCommonCUDA::test_out_masked_argmax_cuda_float32 PASSED [0.7131s] [ 18%] 2025-12-04T13:28:26.4656060Z test_ops.py::TestCommonCUDA::test_out_masked_argmin_cuda_float32 PASSED [0.0107s] [ 18%] 2025-12-04T13:28:26.4656161Z test_ops.py::TestCommonCUDA::test_out_masked_log_softmax_cuda_float32 PASSED [0.7004s] [ 18%] 2025-12-04T13:28:26.4656264Z test_ops.py::TestCommonCUDA::test_out_masked_logaddexp_cuda_float32 PASSED [0.0032s] [ 18%] 2025-12-04T13:28:26.4656364Z test_ops.py::TestCommonCUDA::test_out_masked_scatter_cuda_float32 PASSED [0.7149s] [ 18%] 2025-12-04T13:28:26.4656504Z test_ops.py::TestCommonCUDA::test_out_max_pool2d_with_indices_backward_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 18%] 2025-12-04T13:28:26.4656604Z test_ops.py::TestCommonCUDA::test_out_mean_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 18%] 2025-12-04T13:28:26.4656693Z test_ops.py::TestCommonCUDA::test_out_median_cuda_float32 PASSED [0.7167s] [ 18%] 2025-12-04T13:28:26.4656802Z test_ops.py::TestCommonCUDA::test_out_meshgrid_list_of_tensors_cuda_float32 PASSED [0.0030s] [ 18%] 2025-12-04T13:28:26.4656889Z test_ops.py::TestCommonCUDA::test_out_mode_cuda_float32 PASSED [0.7518s] [ 18%] 2025-12-04T13:28:26.4656978Z test_ops.py::TestCommonCUDA::test_out_nanmedian_cuda_float32 PASSED [0.0033s] [ 18%] 2025-12-04T13:28:26.4657091Z test_ops.py::TestCommonCUDA::test_out_new_empty_strided_cuda_float32 PASSED [0.7193s] [ 18%] 2025-12-04T13:28:26.4657181Z test_ops.py::TestCommonCUDA::test_out_nextafter_cuda_float32 PASSED [0.0311s] [ 18%] 2025-12-04T13:28:26.4657308Z test_ops.py::TestCommonCUDA::test_out_nn_functional_adaptive_max_pool2d_cuda_float32 PASSED [0.7168s] [ 18%] 2025-12-04T13:28:26.4657424Z test_ops.py::TestCommonCUDA::test_out_nn_functional_alpha_dropout_cuda_float32 PASSED [0.0033s] [ 18%] 2025-12-04T13:28:26.4657556Z test_ops.py::TestCommonCUDA::test_out_nn_functional_batch_norm_without_cudnn_cuda_float32 PASSED [0.7047s] [ 18%] 2025-12-04T13:28:26.4657700Z test_ops.py::TestCommonCUDA::test_out_nn_functional_binary_cross_entropy_with_logits_cuda_float32 PASSED [0.0033s] [ 18%] 2025-12-04T13:28:26.4657807Z test_ops.py::TestCommonCUDA::test_out_nn_functional_conv3d_cuda_float32 PASSED [0.7209s] [ 18%] 2025-12-04T13:28:26.4657951Z test_ops.py::TestCommonCUDA::test_out_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [0.0031s] [ 18%] 2025-12-04T13:28:26.4658063Z test_ops.py::TestCommonCUDA::test_out_nn_functional_huber_loss_cuda_float32 PASSED [0.7295s] [ 18%] 2025-12-04T13:28:26.4658184Z test_ops.py::TestCommonCUDA::test_out_nn_functional_l1_loss_cuda_float32 PASSED [0.0030s] [ 18%] 2025-12-04T13:28:26.4658307Z test_ops.py::TestCommonCUDA::test_out_nn_functional_margin_ranking_loss_cuda_float32 PASSED [0.7097s] [ 18%] 2025-12-04T13:28:26.4658425Z 
test_ops.py::TestCommonCUDA::test_out_nn_functional_max_unpool2d_cuda_float32 PASSED [0.0031s] [ 18%] 2025-12-04T13:28:26.4658551Z test_ops.py::TestCommonCUDA::test_out_nn_functional_normalize_cuda_float32 PASSED [0.7074s] [ 18%] 2025-12-04T13:28:26.4658668Z test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_circular_cuda_float32 PASSED [0.0027s] [ 18%] 2025-12-04T13:28:26.4658795Z test_ops.py::TestCommonCUDA::test_out_nn_functional_pad_replicate_negative_cuda_float32 PASSED [0.7028s] [ 18%] 2025-12-04T13:28:26.4658918Z test_ops.py::TestCommonCUDA::test_out_nn_functional_poisson_nll_loss_cuda_float32 PASSED [0.0033s] [ 18%] 2025-12-04T13:28:26.4659023Z test_ops.py::TestCommonCUDA::test_out_nn_functional_relu_cuda_float32 PASSED [0.7224s] [ 18%] 2025-12-04T13:28:26.4659147Z test_ops.py::TestCommonCUDA::test_out_nn_functional_soft_margin_loss_cuda_float32 PASSED [0.0031s] [ 18%] 2025-12-04T13:28:26.4659259Z test_ops.py::TestCommonCUDA::test_out_nn_functional_softshrink_cuda_float32 PASSED [0.7066s] [ 18%] 2025-12-04T13:28:26.4659382Z test_ops.py::TestCommonCUDA::test_out_nn_functional_tanhshrink_cuda_float32 PASSED [0.0108s] [ 18%] 2025-12-04T13:28:26.4659507Z test_ops.py::TestCommonCUDA::test_out_nn_functional_upsample_bilinear_cuda_float32 PASSED [0.7097s] [ 18%] 2025-12-04T13:28:26.4659595Z test_ops.py::TestCommonCUDA::test_out_norm_cuda_float32 PASSED [0.0354s] [ 18%] 2025-12-04T13:28:26.4659690Z test_ops.py::TestCommonCUDA::test_out_norm_nuc_cuda_float32 PASSED [0.7226s] [ 18%] 2025-12-04T13:28:26.4659777Z test_ops.py::TestCommonCUDA::test_out_ones_cuda_float32 PASSED [0.0051s] [ 18%] 2025-12-04T13:28:26.4659873Z test_ops.py::TestCommonCUDA::test_out_ones_like_cuda_float32 PASSED [0.7059s] [ 18%] 2025-12-04T13:28:26.4659959Z test_ops.py::TestCommonCUDA::test_out_pow_cuda_float32 PASSED [0.0109s] [ 18%] 2025-12-04T13:28:26.4660051Z test_ops.py::TestCommonCUDA::test_out_randint_cuda_float32 XFAIL [0.0029s] [ 18%] 2025-12-04T13:28:26.4660145Z 
test_ops.py::TestCommonCUDA::test_out_randn_like_cuda_float32 PASSED [0.7187s] [ 18%] 2025-12-04T13:28:26.4660235Z test_ops.py::TestCommonCUDA::test_out_ravel_cuda_float32 PASSED [0.0027s] [ 18%] 2025-12-04T13:28:26.4660321Z test_ops.py::TestCommonCUDA::test_out_real_cuda_float32 PASSED [0.7108s] [ 18%] 2025-12-04T13:28:26.4660412Z test_ops.py::TestCommonCUDA::test_out_renorm_cuda_float32 PASSED [0.0069s] [ 18%] 2025-12-04T13:28:26.4660525Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_abs_cuda_complex64 PASSED [0.7074s] [ 18%] 2025-12-04T13:28:26.4660641Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_cuda_float32 PASSED [0.0032s] [ 18%] 2025-12-04T13:28:26.4660782Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_complex64 PASSED [0.7104s] [ 18%] 2025-12-04T13:28:26.4660914Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmm_decomposed_cuda_float32 PASSED [0.0032s] [ 18%] 2025-12-04T13:28:26.4661029Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_addmv_cuda_complex64 PASSED [0.7172s] [ 18%] 2025-12-04T13:28:26.4661156Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_alias_copy_cuda_complex64 PASSED [0.0029s] [ 18%] 2025-12-04T13:28:26.4661274Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_complex64 PASSED [0.7070s] [ 18%] 2025-12-04T13:28:26.4661386Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_angle_cuda_float32 PASSED [0.0029s] [ 18%] 2025-12-04T13:28:26.4661501Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_complex64 PASSED [0.7031s] [ 18%] 2025-12-04T13:28:26.4661613Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_atanh_cuda_float32 PASSED [0.0093s] [ 18%] 2025-12-04T13:28:26.4661726Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cat_cuda_float32 PASSED [0.7176s] [ 18%] 2025-12-04T13:28:26.4661903Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cholesky_inverse_cuda_complex64 
PASSED [0.0045s] [ 18%] 2025-12-04T13:28:26.4662021Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cos_cuda_complex64 PASSED [0.7026s] [ 18%] 2025-12-04T13:28:26.4662154Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumprod_cuda_complex64 PASSED [0.0028s] [ 18%] 2025-12-04T13:28:26.4662274Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_cumsum_cuda_complex64 PASSED [0.7040s] [ 18%] 2025-12-04T13:28:26.4662385Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_complex64 PASSED [0.0030s] [ 18%] 2025-12-04T13:28:26.4662500Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_diff_cuda_float32 PASSED [0.7123s] [ 18%] 2025-12-04T13:28:26.4662618Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_dstack_cuda_complex64 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4662731Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_erf_cuda_float32 PASSED [0.7186s] [ 19%] 2025-12-04T13:28:26.4662845Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_exp2_cuda_complex64 PASSED [0.0045s] [ 19%] 2025-12-04T13:28:26.4662985Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expand_copy_cuda_complex64 PASSED [0.7164s] [ 19%] 2025-12-04T13:28:26.4663100Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_complex64 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4663214Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_expm1_cuda_float32 PASSED [0.7202s] [ 19%] 2025-12-04T13:28:26.4663332Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fft_cuda_complex64 PASSED [0.0033s] [ 19%] 2025-12-04T13:28:26.4663451Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_complex64 PASSED [0.7141s] [ 19%] 2025-12-04T13:28:26.4663569Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_fftn_cuda_float32 PASSED [0.0034s] [ 19%] 2025-12-04T13:28:26.4663689Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_hfft2_cuda_complex64 PASSED 
[0.7256s] [ 19%] 2025-12-04T13:28:26.4663808Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_ifft2_cuda_float32 PASSED [0.0034s] [ 19%] 2025-12-04T13:28:26.4663927Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_irfft2_cuda_float32 PASSED [0.7110s] [ 19%] 2025-12-04T13:28:26.4664045Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_fft_rfftn_cuda_float32 PASSED [0.0033s] [ 19%] 2025-12-04T13:28:26.4664157Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_gather_cuda_float32 PASSED [0.7006s] [ 19%] 2025-12-04T13:28:26.4664276Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_index_add_cuda_float32 PASSED [0.0032s] [ 19%] 2025-12-04T13:28:26.4664400Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_kron_cuda_float32 PASSED [0.7156s] [ 19%] 2025-12-04T13:28:26.4664518Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_ldexp_cuda_complex64 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4664639Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_cross_cuda_float32 PASSED [0.7065s] [ 19%] 2025-12-04T13:28:26.4664761Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_eig_cuda_float32 PASSED [0.0143s] [ 19%] 2025-12-04T13:28:26.4664879Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_inv_cuda_float32 PASSED [0.7563s] [ 19%] 2025-12-04T13:28:26.4665004Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_cuda_complex64 PASSED [0.0034s] [ 19%] 2025-12-04T13:28:26.4665139Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_lu_factor_cuda_complex64 PASSED [0.7399s] [ 19%] 2025-12-04T13:28:26.4665261Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_cuda_complex64 PASSED [0.0032s] [ 19%] 2025-12-04T13:28:26.4665415Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_norm_subgradients_at_zero_cuda_float32 PASSED [0.7164s] [ 19%] 2025-12-04T13:28:26.4665554Z 
test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_slogdet_cuda_float32 PASSED [0.0034s] [ 19%] 2025-12-04T13:28:26.4665681Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_cuda_complex64 PASSED [0.7065s] [ 19%] 2025-12-04T13:28:26.4665809Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_solve_ex_cuda_complex64 PASSED [0.0040s] [ 19%] 2025-12-04T13:28:26.4665948Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_svdvals_cuda_complex64 PASSED [0.7170s] [ 19%] 2025-12-04T13:28:26.4666074Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_tensorinv_cuda_float32 PASSED [0.0046s] [ 19%] 2025-12-04T13:28:26.4666201Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vecdot_cuda_complex64 PASSED [0.7050s] [ 19%] 2025-12-04T13:28:26.4666330Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_linalg_vector_norm_cuda_float32 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4666448Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_complex64 PASSED [0.6983s] [ 19%] 2025-12-04T13:28:26.4666560Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log1p_cuda_float32 PASSED [0.0029s] [ 19%] 2025-12-04T13:28:26.4666715Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_complex64 PASSED [0.7056s] [ 19%] 2025-12-04T13:28:26.4666854Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_log_softmax_with_dtype_cuda_float32 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4666973Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_logspace_cuda_complex64 PASSED [0.0021s] [ 19%] 2025-12-04T13:28:26.4667086Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_cuda_float32 PASSED [0.7019s] [ 19%] 2025-12-04T13:28:26.4667204Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_solve_cuda_complex64 PASSED [0.0042s] [ 19%] 2025-12-04T13:28:26.4667328Z 
test_ops.py::TestCommonCUDA::test_out_requires_grad_error_lu_unpack_cuda_complex64 PASSED [0.7296s] [ 19%] 2025-12-04T13:28:26.4667444Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_complex64 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4667559Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_matmul_cuda_float32 PASSED [0.7249s] [ 19%] 2025-12-04T13:28:26.4667678Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_binary_cuda_float32 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4667813Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_min_reduction_no_dim_cuda_float32 PASSED [0.7182s] [ 19%] 2025-12-04T13:28:26.4667923Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mm_cuda_float32 PASSED [0.0032s] [ 19%] 2025-12-04T13:28:26.4668037Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mul_cuda_float32 PASSED [0.7198s] [ 19%] 2025-12-04T13:28:26.4668181Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_mvlgamma_mvlgamma_p_3_cuda_float32 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4668322Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_avg_pool2d_cuda_float32 PASSED [0.7217s] [ 19%] 2025-12-04T13:28:26.4668464Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_nn_functional_softshrink_cuda_float32 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4668582Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_norm_nuc_cuda_complex64 PASSED [0.7188s] [ 19%] 2025-12-04T13:28:26.4668700Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_normal_cuda_float32 PASSED [0.7163s] [ 19%] 2025-12-04T13:28:26.4668823Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_permute_copy_cuda_complex64 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4668942Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_quantile_cuda_float32 PASSED [0.7321s] [ 19%] 2025-12-04T13:28:26.4669055Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_rad2deg_cuda_float32 
PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4669200Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_round_decimals_3_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 19%] 2025-12-04T13:28:26.4669334Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_add_cuda_complex64 PASSED [0.7124s] [ 19%] 2025-12-04T13:28:26.4669469Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_amax_cuda_float32 PASSED [0.0045s] [ 19%] 2025-12-04T13:28:26.4669610Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_mean_cuda_float32 PASSED [0.7105s] [ 19%] 2025-12-04T13:28:26.4669743Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_scatter_reduce_sum_cuda_float32 PASSED [0.0044s] [ 19%] 2025-12-04T13:28:26.4669854Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sgn_cuda_complex64 PASSED [0.7121s] [ 19%] 2025-12-04T13:28:26.4669968Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_complex64 PASSED [0.0042s] [ 19%] 2025-12-04T13:28:26.4670080Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sinc_cuda_float32 PASSED [0.7256s] [ 19%] 2025-12-04T13:28:26.4670195Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sort_cuda_float32 PASSED [0.0031s] [ 19%] 2025-12-04T13:28:26.4670331Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_sparse_sampled_addmm_cuda_complex64 PASSED [0.7221s] [ 19%] 2025-12-04T13:28:26.4670463Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_square_cuda_float32 PASSED [0.0032s] [ 19%] 2025-12-04T13:28:26.4670589Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_squeeze_copy_cuda_complex64 PASSED [0.7267s] [ 19%] 2025-12-04T13:28:26.4670702Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_stack_cuda_complex64 PASSED [0.0030s] [ 19%] 2025-12-04T13:28:26.4670815Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_tanh_cuda_complex64 PASSED [0.7068s] [ 19%] 2025-12-04T13:28:26.4670942Z 
test_ops.py::TestCommonCUDA::test_out_requires_grad_error_triangular_solve_cuda_float32 PASSED [0.0041s] [ 20%] 2025-12-04T13:28:26.4671055Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_trunc_cuda_float32 PASSED [0.7183s] [ 20%] 2025-12-04T13:28:26.4671178Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unbind_copy_cuda_complex64 PASSED [0.0030s] [ 20%] 2025-12-04T13:28:26.4671305Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_unsqueeze_copy_cuda_float32 PASSED [0.7131s] [ 20%] 2025-12-04T13:28:26.4671417Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_var_cuda_complex64 PASSED [0.0030s] [ 20%] 2025-12-04T13:28:26.4671533Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_where_cuda_float32 PASSED [0.7033s] [ 20%] 2025-12-04T13:28:26.4671646Z test_ops.py::TestCommonCUDA::test_out_requires_grad_error_zeros_cuda_complex64 PASSED [0.0027s] [ 20%] 2025-12-04T13:28:26.4671743Z test_ops.py::TestCommonCUDA::test_out_resize__cuda_float32 PASSED [0.7262s] [ 20%] 2025-12-04T13:28:26.4671919Z test_ops.py::TestCommonCUDA::test_out_resolve_neg_cuda_float32 PASSED [0.0029s] [ 20%] 2025-12-04T13:28:26.4672013Z test_ops.py::TestCommonCUDA::test_out_round_cuda_float32 PASSED [0.7250s] [ 20%] 2025-12-04T13:28:26.4672104Z test_ops.py::TestCommonCUDA::test_out_scatter_cuda_float32 PASSED [0.0256s] [ 20%] 2025-12-04T13:28:26.4672194Z test_ops.py::TestCommonCUDA::test_out_short_cuda_float32 PASSED [0.7465s] [ 20%] 2025-12-04T13:28:26.4672312Z test_ops.py::TestCommonCUDA::test_out_signal_windows_exponential_cuda_float32 PASSED [0.0025s] [ 20%] 2025-12-04T13:28:26.4672425Z test_ops.py::TestCommonCUDA::test_out_signal_windows_gaussian_cuda_float32 PASSED [0.0020s] [ 20%] 2025-12-04T13:28:26.4672535Z test_ops.py::TestCommonCUDA::test_out_signal_windows_hamming_cuda_float32 PASSED [0.0020s] [ 20%] 2025-12-04T13:28:26.4672625Z test_ops.py::TestCommonCUDA::test_out_signbit_cuda_float32 PASSED [0.0031s] [ 20%] 2025-12-04T13:28:26.4672714Z 
test_ops.py::TestCommonCUDA::test_out_slice_cuda_float32 PASSED [0.7033s] [ 20%] 2025-12-04T13:28:26.4672811Z test_ops.py::TestCommonCUDA::test_out_slice_scatter_cuda_float32 PASSED [0.0122s] [ 20%] 2025-12-04T13:28:26.4672899Z test_ops.py::TestCommonCUDA::test_out_sort_cuda_float32 PASSED [0.0403s] [ 20%] 2025-12-04T13:28:26.4673050Z test_ops.py::TestCommonCUDA::test_out_sparse_mm_reduce_cuda_float32 SKIPPED [0.0006s] (Only runs on cpu) [ 20%] 2025-12-04T13:28:26.4673157Z test_ops.py::TestCommonCUDA::test_out_special_bessel_j1_cuda_float32 PASSED [0.7302s] [ 20%] 2025-12-04T13:28:26.4673278Z test_ops.py::TestCommonCUDA::test_out_special_chebyshev_polynomial_w_cuda_float32 PASSED [0.0107s] [ 20%] 2025-12-04T13:28:26.4673389Z test_ops.py::TestCommonCUDA::test_out_special_entr_cuda_float32 PASSED [0.1461s] [ 20%] 2025-12-04T13:28:26.4673485Z test_ops.py::TestCommonCUDA::test_out_special_erfcx_cuda_float32 PASSED [0.9026s] [ 20%] 2025-12-04T13:28:26.4673583Z test_ops.py::TestCommonCUDA::test_out_special_i1e_cuda_float32 PASSED [0.0051s] [ 20%] 2025-12-04T13:28:26.4673679Z test_ops.py::TestCommonCUDA::test_out_special_zeta_cuda_float32 PASSED [0.0092s] [ 20%] 2025-12-04T13:28:26.4673772Z test_ops.py::TestCommonCUDA::test_out_square_cuda_float32 PASSED [0.7200s] [ 20%] 2025-12-04T13:28:26.4673861Z test_ops.py::TestCommonCUDA::test_out_squeeze_cuda_float32 PASSED [0.0029s] [ 20%] 2025-12-04T13:28:26.4673950Z test_ops.py::TestCommonCUDA::test_out_stft_cuda_float32 PASSED [0.7214s] [ 20%] 2025-12-04T13:28:26.4674049Z test_ops.py::TestCommonCUDA::test_out_svd_cuda_float32 PASSED [0.3894s] [ 20%] 2025-12-04T13:28:26.4674139Z test_ops.py::TestCommonCUDA::test_out_t_copy_cuda_float32 PASSED [0.7358s] [ 20%] 2025-12-04T13:28:26.4674227Z test_ops.py::TestCommonCUDA::test_out_take_cuda_float32 PASSED [0.0101s] [ 20%] 2025-12-04T13:28:26.4674352Z test_ops.py::TestCommonCUDA::test_out_torch__scaled_mm_v2_cuda_float8_e4m3fn SKIPPED [0.0002s] (Skipped!) 
[ 20%] 2025-12-04T13:28:26.4674454Z test_ops.py::TestCommonCUDA::test_out_triangular_solve_cuda_float32 XFAIL [0.0079s] [ 20%] 2025-12-04T13:28:26.4674540Z test_ops.py::TestCommonCUDA::test_out_trunc_cuda_float32 PASSED [1.4284s] [ 20%] 2025-12-04T13:28:26.4674632Z test_ops.py::TestCommonCUDA::test_out_uniform_cuda_float32 PASSED [0.0033s] [ 20%] 2025-12-04T13:28:26.4674732Z test_ops.py::TestCommonCUDA::test_out_view_as_complex_cuda_float32 PASSED [0.7062s] [ 20%] 2025-12-04T13:28:26.4674823Z test_ops.py::TestCommonCUDA::test_out_view_as_cuda_float32 PASSED [0.0030s] [ 20%] 2025-12-04T13:28:26.4674921Z test_ops.py::TestCommonCUDA::test_out_view_as_real_cuda_complex64 PASSED [0.7183s] [ 20%] 2025-12-04T13:28:26.4675015Z test_ops.py::TestCommonCUDA::test_out_warning___rdiv___cuda PASSED [0.0032s] [ 20%] 2025-12-04T13:28:26.4675105Z test_ops.py::TestCommonCUDA::test_out_warning___rpow___cuda PASSED [0.7085s] [ 20%] 2025-12-04T13:28:26.4675196Z test_ops.py::TestCommonCUDA::test_out_warning___rsub___cuda PASSED [0.0030s] [ 20%] 2025-12-04T13:28:26.4675307Z test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_bool_cuda PASSED [0.7240s] [ 20%] 2025-12-04T13:28:26.4675436Z test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_cfloat_cuda PASSED [0.0030s] [ 20%] 2025-12-04T13:28:26.4675547Z test_ops.py::TestCommonCUDA::test_out_warning__refs__conversions_short_cuda PASSED [0.7108s] [ 20%] 2025-12-04T13:28:26.4675642Z test_ops.py::TestCommonCUDA::test_out_warning__refs_add_cuda PASSED [0.0293s] [ 20%] 2025-12-04T13:28:26.4675735Z test_ops.py::TestCommonCUDA::test_out_warning__refs_amax_cuda PASSED [0.7546s] [ 20%] 2025-12-04T13:28:26.4675831Z test_ops.py::TestCommonCUDA::test_out_warning__refs_atan_cuda PASSED [0.0080s] [ 20%] 2025-12-04T13:28:26.4675933Z test_ops.py::TestCommonCUDA::test_out_warning__refs_bitwise_and_cuda PASSED [0.0191s] [ 20%] 2025-12-04T13:28:26.4676026Z test_ops.py::TestCommonCUDA::test_out_warning__refs_cat_cuda PASSED [0.7233s] 
[ 20%] 2025-12-04T13:28:26.4676121Z test_ops.py::TestCommonCUDA::test_out_warning__refs_clone_cuda PASSED [0.0032s] [ 20%] 2025-12-04T13:28:26.4676223Z test_ops.py::TestCommonCUDA::test_out_warning__refs_contiguous_cuda PASSED [0.7016s] [ 20%] 2025-12-04T13:28:26.4676321Z test_ops.py::TestCommonCUDA::test_out_warning__refs_cumsum_cuda PASSED [0.0159s] [ 20%] 2025-12-04T13:28:26.4676427Z test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_copy_cuda PASSED [0.7399s] [ 20%] 2025-12-04T13:28:26.4676552Z test_ops.py::TestCommonCUDA::test_out_warning__refs_diagonal_scatter_cuda PASSED [0.0378s] [ 20%] 2025-12-04T13:28:26.4676661Z test_ops.py::TestCommonCUDA::test_out_warning__refs_div_floor_rounding_cuda PASSED [0.0566s] [ 20%] 2025-12-04T13:28:26.4676781Z test_ops.py::TestCommonCUDA::test_out_warning__refs_div_trunc_rounding_cuda PASSED [0.0191s] [ 20%] 2025-12-04T13:28:26.4676931Z test_ops.py::TestCommonCUDA::test_out_warning__refs_empty_like_cuda SKIPPED [0.0001s] (Expected: empty is not comparable) [ 20%] 2025-12-04T13:28:26.4677025Z test_ops.py::TestCommonCUDA::test_out_warning__refs_eq_cuda PASSED [0.0184s] [ 20%] 2025-12-04T13:28:26.4677120Z test_ops.py::TestCommonCUDA::test_out_warning__refs_expm1_cuda PASSED [0.7239s] [ 20%] 2025-12-04T13:28:26.4677221Z test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_hfft_cuda PASSED [0.0242s] [ 20%] 2025-12-04T13:28:26.4677319Z test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifft2_cuda PASSED [0.7425s] [ 20%] 2025-12-04T13:28:26.4677420Z test_ops.py::TestCommonCUDA::test_out_warning__refs_fft_ifftn_cuda PASSED [0.0268s] [ 20%] 2025-12-04T13:28:26.4677534Z test_ops.py::TestCommonCUDA::test_out_warning__refs_floor_divide_cuda PASSED [0.0569s] [ 20%] 2025-12-04T13:28:26.4677629Z test_ops.py::TestCommonCUDA::test_out_warning__refs_fmin_cuda PASSED [0.0175s] [ 20%] 2025-12-04T13:28:26.4677723Z test_ops.py::TestCommonCUDA::test_out_warning__refs_fmod_cuda PASSED [0.0179s] [ 20%] 2025-12-04T13:28:26.4677816Z 
test_ops.py::TestCommonCUDA::test_out_warning__refs_ge_cuda PASSED [0.0165s] [ 20%] 2025-12-04T13:28:26.4677912Z test_ops.py::TestCommonCUDA::test_out_warning__refs_hsplit_cuda PASSED [0.7164s] [ 21%] 2025-12-04T13:28:26.4678010Z test_ops.py::TestCommonCUDA::test_out_warning__refs_igammac_cuda PASSED [0.0261s] [ 21%] 2025-12-04T13:28:26.4678111Z test_ops.py::TestCommonCUDA::test_out_warning__refs_index_add_cuda PASSED [0.7395s] [ 21%] 2025-12-04T13:28:26.4678207Z test_ops.py::TestCommonCUDA::test_out_warning__refs_isclose_cuda PASSED [0.0041s] [ 21%] 2025-12-04T13:28:26.4678304Z test_ops.py::TestCommonCUDA::test_out_warning__refs_isreal_cuda PASSED [0.7419s] [ 21%] 2025-12-04T13:28:26.4678502Z test_ops.py::TestCommonCUDA::test_out_warning__refs_item_cuda SKIPPED [0.0031s] (Skipped! Only supports single tensor or iterable of tensor outputs.) [ 21%] 2025-12-04T13:28:26.4678610Z test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_cross_cuda PASSED [0.7383s] [ 21%] 2025-12-04T13:28:26.4678711Z test_ops.py::TestCommonCUDA::test_out_warning__refs_linalg_norm_cuda PASSED [0.2218s] [ 21%] 2025-12-04T13:28:26.4678835Z test_ops.py::TestCommonCUDA::test_out_warning__refs_linspace_tensor_overload_cuda PASSED [0.3242s] [ 21%] 2025-12-04T13:28:26.4678927Z test_ops.py::TestCommonCUDA::test_out_warning__refs_log_cuda PASSED [0.7221s] [ 21%] 2025-12-04T13:28:26.4679089Z test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_cuda SKIPPED [0.0002s] (Expected: empty is not comparable) [ 21%] 2025-12-04T13:28:26.4679200Z test_ops.py::TestCommonCUDA::test_out_warning__refs_new_empty_strided_cuda PASSED [0.7198s] [ 21%] 2025-12-04T13:28:26.4679301Z test_ops.py::TestCommonCUDA::test_out_warning__refs_new_full_cuda PASSED [0.0030s] [ 21%] 2025-12-04T13:28:26.4679399Z test_ops.py::TestCommonCUDA::test_out_warning__refs_new_zeros_cuda PASSED [0.7329s] [ 21%] 2025-12-04T13:28:26.4679511Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_elu_cuda PASSED [0.0105s] 
[ 21%] 2025-12-04T13:28:26.4679630Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_group_norm_cuda PASSED [0.7399s] [ 21%] 2025-12-04T13:28:26.4679750Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_huber_loss_cuda PASSED [0.7397s] [ 21%] 2025-12-04T13:28:26.4679882Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pairwise_distance_cuda PASSED [0.0229s] [ 21%] 2025-12-04T13:28:26.4680004Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_pixel_shuffle_cuda PASSED [0.7135s] [ 21%] 2025-12-04T13:28:26.4680141Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_poisson_nll_loss_cuda PASSED [0.0032s] [ 21%] 2025-12-04T13:28:26.4680252Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_relu_cuda PASSED [0.7206s] [ 21%] 2025-12-04T13:28:26.4680388Z test_ops.py::TestCommonCUDA::test_out_warning__refs_nn_functional_threshold_cuda PASSED [0.0177s] [ 21%] 2025-12-04T13:28:26.4680532Z test_ops.py::TestCommonCUDA::test_out_warning__refs_normal_cuda SKIPPED [0.0001s] (Expected: normal is not comparable) [ 21%] 2025-12-04T13:28:26.4680629Z test_ops.py::TestCommonCUDA::test_out_warning__refs_prod_cuda PASSED [0.0709s] [ 21%] 2025-12-04T13:28:26.4680724Z test_ops.py::TestCommonCUDA::test_out_warning__refs_randn_cuda PASSED [0.7262s] [ 21%] 2025-12-04T13:28:26.4680821Z test_ops.py::TestCommonCUDA::test_out_warning__refs_real_cuda PASSED [0.0028s] [ 21%] 2025-12-04T13:28:26.4680918Z test_ops.py::TestCommonCUDA::test_out_warning__refs_renorm_cuda PASSED [0.7385s] [ 21%] 2025-12-04T13:28:26.4681019Z test_ops.py::TestCommonCUDA::test_out_warning__refs_repeat_cuda PASSED [0.0029s] [ 21%] 2025-12-04T13:28:26.4681125Z test_ops.py::TestCommonCUDA::test_out_warning__refs_sigmoid_cuda PASSED [0.0137s] [ 21%] 2025-12-04T13:28:26.4681220Z test_ops.py::TestCommonCUDA::test_out_warning__refs_sign_cuda PASSED [0.7298s] [ 21%] 2025-12-04T13:28:26.4681313Z 
test_ops.py::TestCommonCUDA::test_out_warning__refs_sin_cuda PASSED [0.0085s] [ 21%] 2025-12-04T13:28:26.4681420Z test_ops.py::TestCommonCUDA::test_out_warning__refs_special_ndtri_cuda PASSED [0.7471s] [ 21%] 2025-12-04T13:28:26.4681543Z test_ops.py::TestCommonCUDA::test_out_warning__refs_special_spherical_bessel_j0_cuda PASSED [0.0138s] [ 21%] 2025-12-04T13:28:26.4681637Z test_ops.py::TestCommonCUDA::test_out_warning__refs_sqrt_cuda PASSED [0.7617s] [ 21%] 2025-12-04T13:28:26.4681731Z test_ops.py::TestCommonCUDA::test_out_warning__refs_sub_cuda PASSED [0.0350s] [ 21%] 2025-12-04T13:28:26.4681835Z test_ops.py::TestCommonCUDA::test_out_warning__refs_tensor_split_cuda PASSED [0.0025s] [ 21%] 2025-12-04T13:28:26.4681970Z test_ops.py::TestCommonCUDA::test_out_warning__refs_trace_cuda PASSED [0.7460s] [ 21%] 2025-12-04T13:28:26.4682075Z test_ops.py::TestCommonCUDA::test_out_warning__refs_tril_indices_cuda PASSED [0.0029s] [ 21%] 2025-12-04T13:28:26.4682170Z test_ops.py::TestCommonCUDA::test_out_warning__refs_triu_cuda PASSED [0.7535s] [ 21%] 2025-12-04T13:28:26.4682268Z test_ops.py::TestCommonCUDA::test_out_warning__refs_unsqueeze_cuda PASSED [0.0032s] [ 21%] 2025-12-04T13:28:26.4682397Z test_ops.py::TestCommonCUDA::test_out_warning__unsafe_masked_index_put_accumulate_cuda PASSED [0.7349s] [ 21%] 2025-12-04T13:28:26.4682485Z test_ops.py::TestCommonCUDA::test_out_warning_addmv_cuda PASSED [0.0195s] [ 21%] 2025-12-04T13:28:26.4682595Z test_ops.py::TestCommonCUDA::test_out_warning_alias_copy_cuda PASSED [0.7419s] [ 21%] 2025-12-04T13:28:26.4682684Z test_ops.py::TestCommonCUDA::test_out_warning_amin_cuda PASSED [0.0393s] [ 21%] 2025-12-04T13:28:26.4682776Z test_ops.py::TestCommonCUDA::test_out_warning_aminmax_cuda PASSED [0.0126s] [ 21%] 2025-12-04T13:28:26.4682864Z test_ops.py::TestCommonCUDA::test_out_warning_argmax_cuda PASSED [0.7762s] [ 21%] 2025-12-04T13:28:26.4682956Z test_ops.py::TestCommonCUDA::test_out_warning_argmin_cuda PASSED [0.0322s] [ 21%] 
2025-12-04T13:28:26.4683048Z test_ops.py::TestCommonCUDA::test_out_warning_as_strided_cuda PASSED [0.7265s] [ 21%] 2025-12-04T13:28:26.4683135Z test_ops.py::TestCommonCUDA::test_out_warning_asin_cuda PASSED [0.0076s] [ 21%] 2025-12-04T13:28:26.4683227Z test_ops.py::TestCommonCUDA::test_out_warning_atleast_3d_cuda PASSED [0.7098s] [ 21%] 2025-12-04T13:28:26.4683319Z test_ops.py::TestCommonCUDA::test_out_warning_baddbmm_cuda PASSED [0.0222s] [ 21%] 2025-12-04T13:28:26.4683410Z test_ops.py::TestCommonCUDA::test_out_warning_bernoulli_cuda XFAIL [0.0044s] [ 21%] 2025-12-04T13:28:26.4683506Z test_ops.py::TestCommonCUDA::test_out_warning_bitwise_and_cuda PASSED [0.0141s] [ 21%] 2025-12-04T13:28:26.4683613Z test_ops.py::TestCommonCUDA::test_out_warning_bitwise_not_cuda PASSED [0.7490s] [ 21%] 2025-12-04T13:28:26.4683709Z test_ops.py::TestCommonCUDA::test_out_warning_bitwise_or_cuda PASSED [0.0205s] [ 21%] 2025-12-04T13:28:26.4683814Z test_ops.py::TestCommonCUDA::test_out_warning_broadcast_tensors_cuda PASSED [0.7321s] [ 21%] 2025-12-04T13:28:26.4683915Z test_ops.py::TestCommonCUDA::test_out_warning_cauchy_cuda PASSED [0.0092s] [ 21%] 2025-12-04T13:28:26.4684009Z test_ops.py::TestCommonCUDA::test_out_warning_cdouble_cuda PASSED [0.7345s] [ 21%] 2025-12-04T13:28:26.4684107Z test_ops.py::TestCommonCUDA::test_out_warning_cholesky_solve_cuda PASSED [0.0236s] [ 21%] 2025-12-04T13:28:26.4684199Z test_ops.py::TestCommonCUDA::test_out_warning_chunk_cuda PASSED [0.7433s] [ 21%] 2025-12-04T13:28:26.4684295Z test_ops.py::TestCommonCUDA::test_out_warning_column_stack_cuda PASSED [0.0161s] [ 21%] 2025-12-04T13:28:26.4684384Z test_ops.py::TestCommonCUDA::test_out_warning_cross_cuda PASSED [0.7132s] [ 21%] 2025-12-04T13:28:26.4684473Z test_ops.py::TestCommonCUDA::test_out_warning_cummax_cuda PASSED [0.0154s] [ 21%] 2025-12-04T13:28:26.4684563Z test_ops.py::TestCommonCUDA::test_out_warning_cumsum_cuda PASSED [0.7420s] [ 21%] 2025-12-04T13:28:26.4684665Z 
test_ops.py::TestCommonCUDA::test_out_warning_deg2rad_cuda PASSED [0.0078s] [ 21%] 2025-12-04T13:28:26.4684755Z test_ops.py::TestCommonCUDA::test_out_warning_diag_cuda PASSED [0.7831s] [ 21%] 2025-12-04T13:28:26.4684845Z test_ops.py::TestCommonCUDA::test_out_warning_digamma_cuda PASSED [0.0124s] [ 21%] 2025-12-04T13:28:26.4684951Z test_ops.py::TestCommonCUDA::test_out_warning_div_floor_rounding_cuda PASSED [0.0192s] [ 22%] 2025-12-04T13:28:26.4685047Z test_ops.py::TestCommonCUDA::test_out_warning_empty_strided_cuda PASSED [0.7318s] [ 22%] 2025-12-04T13:28:26.4685232Z test_ops.py::TestCommonCUDA::test_out_warning_equal_cuda SKIPPED [0.0031s] (Skipped! Only supports single tensor or iterable of tensor outputs.) [ 22%] 2025-12-04T13:28:26.4685319Z test_ops.py::TestCommonCUDA::test_out_warning_erfc_cuda PASSED [0.7113s] [ 22%] 2025-12-04T13:28:26.4685415Z test_ops.py::TestCommonCUDA::test_out_warning_expand_copy_cuda PASSED [0.0260s] [ 22%] 2025-12-04T13:28:26.4685506Z test_ops.py::TestCommonCUDA::test_out_warning_fft_fft2_cuda PASSED [0.7218s] [ 22%] 2025-12-04T13:28:26.4685597Z test_ops.py::TestCommonCUDA::test_out_warning_fft_fft_cuda PASSED [0.0237s] [ 22%] 2025-12-04T13:28:26.4685689Z test_ops.py::TestCommonCUDA::test_out_warning_fft_hfft_cuda PASSED [0.7246s] [ 22%] 2025-12-04T13:28:26.4685780Z test_ops.py::TestCommonCUDA::test_out_warning_fft_irfft2_cuda PASSED [0.0204s] [ 22%] 2025-12-04T13:28:26.4685872Z test_ops.py::TestCommonCUDA::test_out_warning_fft_rfft_cuda PASSED [0.7215s] [ 22%] 2025-12-04T13:28:26.4685957Z test_ops.py::TestCommonCUDA::test_out_warning_fill_cuda PASSED [0.0030s] [ 22%] 2025-12-04T13:28:26.4686063Z test_ops.py::TestCommonCUDA::test_out_warning_flip_cuda PASSED [0.7076s] [ 22%] 2025-12-04T13:28:26.4686158Z test_ops.py::TestCommonCUDA::test_out_warning_float_power_cuda PASSED [0.0230s] [ 22%] 2025-12-04T13:28:26.4686320Z test_ops.py::TestCommonCUDA::test_out_warning_gather_cuda PASSED [0.7237s] [ 22%] 2025-12-04T13:28:26.4686407Z 
test_ops.py::TestCommonCUDA::test_out_warning_half_cuda PASSED [0.0030s] [ 22%] 2025-12-04T13:28:26.4686500Z test_ops.py::TestCommonCUDA::test_out_warning_hash_tensor_cuda PASSED [0.7482s] [ 22%] 2025-12-04T13:28:26.4686588Z test_ops.py::TestCommonCUDA::test_out_warning_hsplit_cuda PASSED [0.0029s] [ 22%] 2025-12-04T13:28:26.4686677Z test_ops.py::TestCommonCUDA::test_out_warning_igammac_cuda PASSED [0.0209s] [ 22%] 2025-12-04T13:28:26.4686761Z test_ops.py::TestCommonCUDA::test_out_warning_imag_cuda PASSED [0.7285s] [ 22%] 2025-12-04T13:28:26.4686854Z test_ops.py::TestCommonCUDA::test_out_warning_index_copy_cuda PASSED [0.0146s] [ 22%] 2025-12-04T13:28:26.4686957Z test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_amin_cuda PASSED [0.7401s] [ 22%] 2025-12-04T13:28:26.4687072Z test_ops.py::TestCommonCUDA::test_out_warning_index_reduce_mean_cuda PASSED [0.0241s] [ 22%] 2025-12-04T13:28:26.4687159Z test_ops.py::TestCommonCUDA::test_out_warning_istft_cuda PASSED [0.7231s] [ 22%] 2025-12-04T13:28:26.4687338Z test_ops.py::TestCommonCUDA::test_out_warning_item_cuda SKIPPED [0.0031s] (Skipped! Only supports single tensor or iterable of tensor outputs.) 
[ 22%] 2025-12-04T13:28:26.4687439Z test_ops.py::TestCommonCUDA::test_out_warning_kthvalue_cuda PASSED [0.7383s] [ 22%] 2025-12-04T13:28:26.4687525Z test_ops.py::TestCommonCUDA::test_out_warning_le_cuda PASSED [0.0206s] [ 22%] 2025-12-04T13:28:26.4687609Z test_ops.py::TestCommonCUDA::test_out_warning_lerp_cuda PASSED [0.7587s] [ 22%] 2025-12-04T13:28:26.4687706Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_cross_cuda PASSED [0.0155s] [ 22%] 2025-12-04T13:28:26.4687809Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_ldl_factor_cuda PASSED [0.0114s] [ 22%] 2025-12-04T13:28:26.4687911Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_cuda PASSED [0.0755s] [ 22%] 2025-12-04T13:28:26.4688018Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_lu_factor_ex_cuda PASSED [0.7894s] [ 22%] 2025-12-04T13:28:26.4688122Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_norm_cuda PASSED [0.2142s] [ 22%] 2025-12-04T13:28:26.4688234Z test_ops.py::TestCommonCUDA::test_out_warning_linalg_solve_triangular_cuda PASSED [0.1029s] [ 22%] 2025-12-04T13:28:26.4688328Z test_ops.py::TestCommonCUDA::test_out_warning_logical_and_cuda PASSED [0.0144s] [ 22%] 2025-12-04T13:28:26.4688422Z test_ops.py::TestCommonCUDA::test_out_warning_logical_not_cuda PASSED [0.0053s] [ 22%] 2025-12-04T13:28:26.4688515Z test_ops.py::TestCommonCUDA::test_out_warning_logical_xor_cuda PASSED [0.0140s] [ 22%] 2025-12-04T13:28:26.4688606Z test_ops.py::TestCommonCUDA::test_out_warning_lu_solve_cuda PASSED [0.0323s] [ 22%] 2025-12-04T13:28:26.4688702Z test_ops.py::TestCommonCUDA::test_out_warning_masked_argmax_cuda PASSED [0.7235s] [ 22%] 2025-12-04T13:28:26.4688801Z test_ops.py::TestCommonCUDA::test_out_warning_masked_cumprod_cuda PASSED [0.0032s] [ 22%] 2025-12-04T13:28:26.4688903Z test_ops.py::TestCommonCUDA::test_out_warning_masked_log_softmax_cuda PASSED [0.7041s] [ 22%] 2025-12-04T13:28:26.4688999Z test_ops.py::TestCommonCUDA::test_out_warning_masked_median_cuda PASSED [0.0030s] [ 
22%] 2025-12-04T13:28:26.4689096Z test_ops.py::TestCommonCUDA::test_out_warning_masked_softmin_cuda PASSED [0.7186s] [ 22%] 2025-12-04T13:28:26.4689189Z test_ops.py::TestCommonCUDA::test_out_warning_masked_sum_cuda PASSED [0.0030s] [ 22%] 2025-12-04T13:28:26.4689302Z test_ops.py::TestCommonCUDA::test_out_warning_meshgrid_variadic_tensors_cuda PASSED [0.7200s] [ 22%] 2025-12-04T13:28:26.4689388Z test_ops.py::TestCommonCUDA::test_out_warning_mm_cuda PASSED [0.0129s] [ 22%] 2025-12-04T13:28:26.4689483Z test_ops.py::TestCommonCUDA::test_out_warning_mode_cuda XFAIL [0.0111s] [ 22%] 2025-12-04T13:28:26.4689577Z test_ops.py::TestCommonCUDA::test_out_warning_narrow_copy_cuda XFAIL [0.0027s] [ 22%] 2025-12-04T13:28:26.4689663Z test_ops.py::TestCommonCUDA::test_out_warning_neg_cuda PASSED [1.4516s] [ 22%] 2025-12-04T13:28:26.4689753Z test_ops.py::TestCommonCUDA::test_out_warning_new_ones_cuda PASSED [0.0027s] [ 22%] 2025-12-04T13:28:26.4689885Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_batch_norm_without_cudnn_cuda PASSED [0.7297s] [ 22%] 2025-12-04T13:28:26.4689995Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_bilinear_cuda PASSED [0.0037s] [ 22%] 2025-12-04T13:28:26.4690104Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv1d_cuda PASSED [0.7266s] [ 22%] 2025-12-04T13:28:26.4690209Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv2d_cuda PASSED [0.0034s] [ 22%] 2025-12-04T13:28:26.4690330Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_conv_transpose2d_cuda PASSED [0.7094s] [ 22%] 2025-12-04T13:28:26.4690451Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cosine_similarity_cuda PASSED [0.0032s] [ 22%] 2025-12-04T13:28:26.4690577Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_cross_entropy_cuda PASSED [0.7183s] [ 22%] 2025-12-04T13:28:26.4690684Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_ctc_loss_cuda PASSED [0.0095s] [ 22%] 2025-12-04T13:28:26.4690788Z 
test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_elu_cuda PASSED [0.7064s] [ 22%] 2025-12-04T13:28:26.4690902Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_gelu_cuda PASSED [0.0231s] [ 22%] 2025-12-04T13:28:26.4691014Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_grid_sample_cuda PASSED [0.7368s] [ 22%] 2025-12-04T13:28:26.4691124Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardshrink_cuda PASSED [0.0174s] [ 22%] 2025-12-04T13:28:26.4691235Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hardswish_cuda PASSED [0.7271s] [ 22%] 2025-12-04T13:28:26.4691360Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_hinge_embedding_loss_cuda PASSED [0.0033s] [ 22%] 2025-12-04T13:28:26.4691480Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_area_cuda PASSED [0.7222s] [ 22%] 2025-12-04T13:28:26.4691601Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_interpolate_linear_cuda PASSED [0.0085s] [ 22%] 2025-12-04T13:28:26.4691733Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_local_response_norm_cuda PASSED [0.7202s] [ 22%] 2025-12-04T13:28:26.4691838Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_mish_cuda PASSED [0.0123s] [ 22%] 2025-12-04T13:28:26.4692017Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_multi_head_attention_forward_cuda PASSED [0.7294s] [ 23%] 2025-12-04T13:28:26.4692131Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_circular_cuda PASSED [0.0027s] [ 23%] 2025-12-04T13:28:26.4692242Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_pad_reflect_cuda PASSED [0.7200s] [ 23%] 2025-12-04T13:28:26.4692346Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_relu_cuda PASSED [0.0029s] [ 23%] 2025-12-04T13:28:26.4692468Z test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_softmin_with_dtype_cuda PASSED [0.7135s] [ 23%] 2025-12-04T13:28:26.4692579Z 
test_ops.py::TestCommonCUDA::test_out_warning_nn_functional_threshold_cuda PASSED [0.0030s] [ 23%] 2025-12-04T13:28:26.4692667Z test_ops.py::TestCommonCUDA::test_out_warning_nonzero_cuda XFAIL [0.0035s] [ 23%] 2025-12-04T13:28:26.4692788Z test_ops.py::TestCommonCUDA::test_out_warning_normal_number_mean_cuda SKIPPED [0.0003s] (Skipped!) [ 23%] 2025-12-04T13:28:26.4692880Z test_ops.py::TestCommonCUDA::test_out_warning_ones_like_cuda PASSED [0.7238s] [ 23%] 2025-12-04T13:28:26.4692972Z test_ops.py::TestCommonCUDA::test_out_warning_permute_cuda PASSED [0.0029s] [ 23%] 2025-12-04T13:28:26.4693063Z test_ops.py::TestCommonCUDA::test_out_warning_pinverse_cuda PASSED [0.7247s] [ 23%] 2025-12-04T13:28:26.4693171Z test_ops.py::TestCommonCUDA::test_out_warning_quantile_cuda PASSED [0.1199s] [ 23%] 2025-12-04T13:28:26.4693267Z test_ops.py::TestCommonCUDA::test_out_warning_randint_like_cuda PASSED [0.7586s] [ 23%] 2025-12-04T13:28:26.4693361Z test_ops.py::TestCommonCUDA::test_out_warning_reshape_as_cuda PASSED [0.0030s] [ 23%] 2025-12-04T13:28:26.4693452Z test_ops.py::TestCommonCUDA::test_out_warning_resize__cuda PASSED [0.7271s] [ 23%] 2025-12-04T13:28:26.4693538Z test_ops.py::TestCommonCUDA::test_out_warning_rot90_cuda PASSED [0.0031s] [ 23%] 2025-12-04T13:28:26.4693627Z test_ops.py::TestCommonCUDA::test_out_warning_round_cuda PASSED [0.7277s] [ 23%] 2025-12-04T13:28:26.4693712Z test_ops.py::TestCommonCUDA::test_out_warning_rsqrt_cuda PASSED [0.0111s] [ 23%] 2025-12-04T13:28:26.4693798Z test_ops.py::TestCommonCUDA::test_out_warning_rsub_cuda PASSED [0.7362s] [ 23%] 2025-12-04T13:28:26.4693895Z test_ops.py::TestCommonCUDA::test_out_warning_select_scatter_cuda PASSED [0.0031s] [ 23%] 2025-12-04T13:28:26.4693983Z test_ops.py::TestCommonCUDA::test_out_warning_short_cuda PASSED [0.7362s] [ 23%] 2025-12-04T13:28:26.4694080Z test_ops.py::TestCommonCUDA::test_out_warning_sign_cuda PASSED [0.0076s] [ 23%] 2025-12-04T13:28:26.4694188Z 
test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_cosine_cuda PASSED [0.0020s] [ 23%] 2025-12-04T13:28:26.4694300Z test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_gaussian_cuda PASSED [0.0020s] [ 23%] 2025-12-04T13:28:26.4694422Z test_ops.py::TestCommonCUDA::test_out_warning_signal_windows_kaiser_cuda PASSED [0.0019s] [ 23%] 2025-12-04T13:28:26.4694511Z test_ops.py::TestCommonCUDA::test_out_warning_signbit_cuda PASSED [0.7283s] [ 23%] 2025-12-04T13:28:26.4694597Z test_ops.py::TestCommonCUDA::test_out_warning_sinh_cuda PASSED [0.0078s] [ 23%] 2025-12-04T13:28:26.4694698Z test_ops.py::TestCommonCUDA::test_out_warning_special_bessel_j1_cuda PASSED [0.7510s] [ 23%] 2025-12-04T13:28:26.4694823Z test_ops.py::TestCommonCUDA::test_out_warning_special_chebyshev_polynomial_w_cuda PASSED [0.0194s] [ 23%] 2025-12-04T13:28:26.4694919Z test_ops.py::TestCommonCUDA::test_out_warning_special_entr_cuda PASSED [0.0051s] [ 23%] 2025-12-04T13:28:26.4695014Z test_ops.py::TestCommonCUDA::test_out_warning_special_i1_cuda PASSED [0.7480s] [ 23%] 2025-12-04T13:28:26.4695128Z test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i0_cuda PASSED [0.0134s] [ 23%] 2025-12-04T13:28:26.4695256Z test_ops.py::TestCommonCUDA::test_out_warning_special_modified_bessel_i1_cuda PASSED [0.7359s] [ 23%] 2025-12-04T13:28:26.4695357Z test_ops.py::TestCommonCUDA::test_out_warning_split_list_args_cuda PASSED [0.0029s] [ 23%] 2025-12-04T13:28:26.4695458Z test_ops.py::TestCommonCUDA::test_out_warning_split_with_sizes_cuda PASSED [0.7181s] [ 23%] 2025-12-04T13:28:26.4695555Z test_ops.py::TestCommonCUDA::test_out_warning_squeeze_copy_cuda PASSED [0.0273s] [ 23%] 2025-12-04T13:28:26.4695649Z test_ops.py::TestCommonCUDA::test_out_warning_svd_lowrank_cuda PASSED [0.7087s] [ 23%] 2025-12-04T13:28:26.4695747Z test_ops.py::TestCommonCUDA::test_out_warning_take_along_dim_cuda PASSED [0.0201s] [ 23%] 2025-12-04T13:28:26.4695842Z 
test_ops.py::TestCommonCUDA::test_out_warning_tensor_split_cuda PASSED [0.7099s] [ 23%] 2025-12-04T13:28:26.4695929Z test_ops.py::TestCommonCUDA::test_out_warning_triu_cuda PASSED [0.0235s] [ 23%] 2025-12-04T13:28:26.4696018Z test_ops.py::TestCommonCUDA::test_out_warning_uniform_cuda PASSED [0.7351s] [ 23%] 2025-12-04T13:28:26.4696121Z test_ops.py::TestCommonCUDA::test_out_warning_var_mean_unbiased_cuda PASSED [0.0031s] [ 23%] 2025-12-04T13:28:26.4696220Z test_ops.py::TestCommonCUDA::test_out_warning_view_as_complex_cuda PASSED [0.7248s] [ 23%] 2025-12-04T13:28:26.4696311Z test_ops.py::TestCommonCUDA::test_out_warning_view_as_cuda PASSED [0.0029s] [ 23%] 2025-12-04T13:28:26.4696405Z test_ops.py::TestCommonCUDA::test_out_warning_view_as_real_cuda PASSED [0.7259s] [ 23%] 2025-12-04T13:28:26.4696498Z test_ops.py::TestCommonCUDA::test_out_warning_view_copy_cuda PASSED [0.0222s] [ 23%] 2025-12-04T13:28:26.4696593Z test_ops.py::TestCommonCUDA::test_out_xlogy_cuda_float32 PASSED [0.0095s] [ 23%] 2025-12-04T13:28:26.4696681Z test_ops.py::TestCommonCUDA::test_out_zeros_cuda_float32 PASSED [0.7235s] [ 23%] 2025-12-04T13:28:26.4696772Z test_ops.py::TestCommonCUDA::test_out_zeros_like_cuda_float32 PASSED [0.0030s] [ 23%] 2025-12-04T13:28:26.4696867Z test_ops.py::TestCommonCUDA::test_pointwise_tag_coverage_cuda PASSED [0.0037s] [ 23%] 2025-12-04T13:28:26.4696977Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_bool PASSED [0.0060s] [ 23%] 2025-12-04T13:28:26.4697088Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int16 PASSED [0.0044s] [ 23%] 2025-12-04T13:28:26.4697198Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int64 PASSED [0.0039s] [ 23%] 2025-12-04T13:28:26.4697306Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float___rdiv___cuda_int8 PASSED [0.0035s] [ 23%] 2025-12-04T13:28:26.4697413Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_bool PASSED [0.7219s] [ 23%] 
2025-12-04T13:28:26.4697519Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acos_cuda_uint8 PASSED [0.0032s] [ 23%] 2025-12-04T13:28:26.4697637Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int16 PASSED [0.7151s] [ 23%] 2025-12-04T13:28:26.4697743Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_acosh_cuda_int64 PASSED [0.0033s] [ 23%] 2025-12-04T13:28:26.4697861Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asin_cuda_uint8 PASSED [0.7330s] [ 23%] 2025-12-04T13:28:26.4697966Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_asinh_cuda_int8 PASSED [0.0030s] [ 23%] 2025-12-04T13:28:26.4698071Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_int64 PASSED [0.0036s] [ 23%] 2025-12-04T13:28:26.4698176Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_atan2_cuda_uint8 PASSED [0.0033s] [ 23%] 2025-12-04T13:28:26.4698288Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_copysign_cuda_int8 PASSED [0.0034s] [ 23%] 2025-12-04T13:28:26.4698392Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_cosh_cuda_int8 PASSED [0.7214s] [ 23%] 2025-12-04T13:28:26.4698502Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_deg2rad_cuda_bool PASSED [0.0031s] [ 23%] 2025-12-04T13:28:26.4698621Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int16 PASSED [0.7194s] [ 23%] 2025-12-04T13:28:26.4698726Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erf_cuda_int64 PASSED [0.0030s] [ 23%] 2025-12-04T13:28:26.4698828Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_bool PASSED [0.7234s] [ 24%] 2025-12-04T13:28:26.4698935Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfc_cuda_uint8 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4699043Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_int16 PASSED [0.7331s] [ 24%] 2025-12-04T13:28:26.4699151Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_erfinv_cuda_uint8 PASSED 
[0.0030s] [ 24%] 2025-12-04T13:28:26.4699255Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_bool PASSED [0.7158s] [ 24%] 2025-12-04T13:28:26.4699361Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int16 PASSED [0.0032s] [ 24%] 2025-12-04T13:28:26.4699465Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp2_cuda_int8 PASSED [0.7242s] [ 24%] 2025-12-04T13:28:26.4699568Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_exp_cuda_int32 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4699674Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_bool PASSED [0.7219s] [ 24%] 2025-12-04T13:28:26.4699780Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int16 PASSED [0.0030s] [ 24%] 2025-12-04T13:28:26.4699887Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_expm1_cuda_int64 PASSED [0.7049s] [ 24%] 2025-12-04T13:28:26.4700000Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_float_power_cuda_int16 PASSED [0.0044s] [ 24%] 2025-12-04T13:28:26.4700113Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_i0_cuda_bool PASSED [0.7301s] [ 24%] 2025-12-04T13:28:26.4700219Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int16 PASSED [0.0045s] [ 24%] 2025-12-04T13:28:26.4700327Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_int64 PASSED [0.0036s] [ 24%] 2025-12-04T13:28:26.4700433Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_ldexp_cuda_uint8 PASSED [0.0035s] [ 24%] 2025-12-04T13:28:26.4700540Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log1p_cuda_bool PASSED [0.0021s] [ 24%] 2025-12-04T13:28:26.4700643Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_log_cuda_int32 PASSED [0.7207s] [ 24%] 2025-12-04T13:28:26.4700755Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_int16 PASSED [0.0280s] [ 24%] 2025-12-04T13:28:26.4700868Z 
test_ops.py::TestCommonCUDA::test_promotes_int_to_float_masked_std_cuda_uint8 PASSED [0.0262s] [ 24%] 2025-12-04T13:28:26.4700996Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_1_cuda_int64 PASSED [0.7167s] [ 24%] 2025-12-04T13:28:26.4701133Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_3_cuda_uint8 PASSED [0.0045s] [ 24%] 2025-12-04T13:28:26.4701259Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int16 PASSED [0.7166s] [ 24%] 2025-12-04T13:28:26.4701385Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_mvlgamma_mvlgamma_p_5_cuda_int64 PASSED [0.0044s] [ 24%] 2025-12-04T13:28:26.4701522Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_0_cuda_bool PASSED [0.7306s] [ 24%] 2025-12-04T13:28:26.4701670Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_1_cuda_int64 SKIPPED [0.0002s] (Skipped!) [ 24%] 2025-12-04T13:28:26.4701816Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_2_cuda_int64 SKIPPED [0.0001s] (Skipped!) [ 24%] 2025-12-04T13:28:26.4702003Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_3_cuda_int8 SKIPPED [0.0001s] (Skipped!) [ 24%] 2025-12-04T13:28:26.4702146Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_bool SKIPPED [0.0001s] (Skipped!) [ 24%] 2025-12-04T13:28:26.4702306Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_int64 SKIPPED [0.0001s] (Skipped!) [ 24%] 2025-12-04T13:28:26.4702449Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_polygamma_polygamma_n_4_cuda_uint8 SKIPPED [0.0001s] (Skipped!) 
[ 24%] 2025-12-04T13:28:26.4702561Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rad2deg_cuda_uint8 PASSED [0.0029s] [ 24%] 2025-12-04T13:28:26.4702669Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_rsqrt_cuda_int16 PASSED [0.7109s] [ 24%] 2025-12-04T13:28:26.4702778Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sigmoid_cuda_int32 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4702885Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_bool PASSED [0.7238s] [ 24%] 2025-12-04T13:28:26.4702991Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int32 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4703096Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinc_cuda_int64 PASSED [0.7180s] [ 24%] 2025-12-04T13:28:26.4703201Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sinh_cuda_int32 PASSED [0.0030s] [ 24%] 2025-12-04T13:28:26.4703344Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_t_cuda_int64 PASSED [0.0036s] [ 24%] 2025-12-04T13:28:26.4703482Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_bool PASSED [0.0035s] [ 24%] 2025-12-04T13:28:26.4703619Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_chebyshev_polynomial_u_cuda_int8 PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4703768Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_h_cuda_int16 PASSED [0.0055s] [ 24%] 2025-12-04T13:28:26.4703907Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_hermite_polynomial_he_cuda_int64 PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4704047Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_bool PASSED [0.0049s] [ 24%] 2025-12-04T13:28:26.4704193Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int16 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4704331Z 
test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_laguerre_polynomial_l_cuda_int64 PASSED [0.0032s] [ 24%] 2025-12-04T13:28:26.4704470Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_legendre_polynomial_p_cuda_int8 PASSED [0.0050s] [ 24%] 2025-12-04T13:28:26.4704631Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_t_cuda_uint8 PASSED [0.0064s] [ 24%] 2025-12-04T13:28:26.4704784Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_u_cuda_int32 PASSED [0.0054s] [ 24%] 2025-12-04T13:28:26.4704936Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int32 PASSED [0.0052s] [ 24%] 2025-12-04T13:28:26.4705098Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_v_cuda_int64 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4705265Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_bool PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4705415Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_shifted_chebyshev_polynomial_w_cuda_uint8 PASSED [0.0032s] [ 24%] 2025-12-04T13:28:26.4705542Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int16 PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4705662Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_xlog1py_cuda_int32 PASSED [0.0032s] [ 24%] 2025-12-04T13:28:26.4705785Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int64 PASSED [0.0033s] [ 24%] 2025-12-04T13:28:26.4705901Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_special_zeta_cuda_int8 PASSED [0.0032s] [ 24%] 2025-12-04T13:28:26.4706012Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_bool PASSED [0.7260s] [ 24%] 2025-12-04T13:28:26.4706133Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_sqrt_cuda_int8 PASSED 
[0.0030s] [ 24%] 2025-12-04T13:28:26.4706240Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_bool PASSED [0.7065s] [ 24%] 2025-12-04T13:28:26.4706347Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tan_cuda_uint8 PASSED [0.0029s] [ 24%] 2025-12-04T13:28:26.4706454Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int16 PASSED [0.7144s] [ 24%] 2025-12-04T13:28:26.4706564Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_tanh_cuda_int64 PASSED [0.0030s] [ 24%] 2025-12-04T13:28:26.4706680Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_true_divide_cuda_int32 PASSED [0.0042s] [ 24%] 2025-12-04T13:28:26.4706792Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_bool PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4706901Z test_ops.py::TestCommonCUDA::test_promotes_int_to_float_xlogy_cuda_uint8 PASSED [0.0034s] [ 24%] 2025-12-04T13:28:26.4707008Z test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_complex128 PASSED [0.7161s] [ 24%] 2025-12-04T13:28:26.4707106Z test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int32 PASSED [0.0039s] [ 25%] 2025-12-04T13:28:26.4707207Z test_ops.py::TestCommonCUDA::test_python_ref__refs_T_cuda_int64 PASSED [0.7228s] [ 25%] 2025-12-04T13:28:26.4707330Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int16 PASSED [0.0207s] [ 25%] 2025-12-04T13:28:26.4707454Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bfloat16_cuda_int8 PASSED [0.0173s] [ 25%] 2025-12-04T13:28:26.4707664Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_float64 PASSED [0.0165s] [ 25%] 2025-12-04T13:28:26.4707784Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_bool_cuda_int16 PASSED [0.0148s] [ 25%] 2025-12-04T13:28:26.4707901Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_byte_cuda_uint8 PASSED [0.7371s] [ 25%] 2025-12-04T13:28:26.4708031Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float16 PASSED [0.0237s] [ 25%] 2025-12-04T13:28:26.4708161Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cdouble_cuda_float64 PASSED [0.0211s] [ 25%] 2025-12-04T13:28:26.4708288Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex32 PASSED [0.0793s] [ 25%] 2025-12-04T13:28:26.4708416Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_complex64 PASSED [0.7690s] [ 25%] 2025-12-04T13:28:26.4708536Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_float64 PASSED [0.0242s] [ 25%] 2025-12-04T13:28:26.4708662Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int32 PASSED [0.0191s] [ 25%] 2025-12-04T13:28:26.4708791Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_cfloat_cuda_int8 PASSED [0.0179s] [ 25%] 2025-12-04T13:28:26.4708923Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex128 PASSED [0.7557s] [ 25%] 2025-12-04T13:28:26.4709047Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_complex32 PASSED [0.0338s] [ 25%] 2025-12-04T13:28:26.4709182Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_chalf_cuda_int8 PASSED [0.0182s] [ 25%] 2025-12-04T13:28:26.4709304Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_char_cuda_complex32 PASSED [0.0288s] [ 25%] 2025-12-04T13:28:26.4709434Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_complex128 PASSED [0.7570s] [ 25%] 2025-12-04T13:28:26.4709558Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_double_cuda_float16 PASSED [0.0221s] [ 25%] 2025-12-04T13:28:26.4709681Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_float_cuda_bfloat16 PASSED [0.0205s] [ 25%] 2025-12-04T13:28:26.4709801Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_half_cuda_complex64 PASSED [0.0322s] [ 25%] 2025-12-04T13:28:26.4709935Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_int_cuda_bfloat16 PASSED [0.0163s] [ 25%] 2025-12-04T13:28:26.4710059Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float16 PASSED [0.0162s] [ 25%] 2025-12-04T13:28:26.4710177Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_float32 PASSED [0.0161s] [ 25%] 2025-12-04T13:28:26.4710292Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_long_cuda_int8 PASSED [0.0142s] [ 25%] 2025-12-04T13:28:26.4710411Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_polar_cuda_float32 PASSED [0.0949s] [ 25%] 2025-12-04T13:28:26.4710535Z test_ops.py::TestCommonCUDA::test_python_ref__refs__conversions_short_cuda_float32 PASSED [0.7553s] [ 25%] 2025-12-04T13:28:26.4710637Z test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_bool PASSED [0.0153s] [ 25%] 2025-12-04T13:28:26.4710744Z test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_complex32 PASSED [0.1866s] [ 25%] 2025-12-04T13:28:26.4710849Z test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float16 PASSED [0.0214s] [ 25%] 2025-12-04T13:28:26.4710955Z test_ops.py::TestCommonCUDA::test_python_ref__refs_abs_cuda_float32 PASSED [0.0150s] [ 25%] 2025-12-04T13:28:26.4711061Z test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_float32 PASSED [0.0166s] [ 25%] 2025-12-04T13:28:26.4711166Z test_ops.py::TestCommonCUDA::test_python_ref__refs_acos_cuda_int32 PASSED [0.0181s] [ 25%] 2025-12-04T13:28:26.4711273Z test_ops.py::TestCommonCUDA::test_python_ref__refs_acosh_cuda_bfloat16 PASSED [0.0231s] [ 25%] 2025-12-04T13:28:26.4711388Z test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_bfloat16 PASSED [0.8487s] [ 25%] 2025-12-04T13:28:26.4711495Z test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex128 PASSED 
[0.0808s] [ 25%] 2025-12-04T13:28:26.4711602Z test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_complex32 PASSED [0.1094s] [ 25%] 2025-12-04T13:28:26.4711754Z test_ops.py::TestCommonCUDA::test_python_ref__refs_add_cuda_float16 PASSED [0.0937s] [ 25%] 2025-12-04T13:28:26.4711913Z test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_complex64 PASSED [0.5865s] [ 25%] 2025-12-04T13:28:26.4712026Z test_ops.py::TestCommonCUDA::test_python_ref__refs_addcmul_cuda_int64 PASSED [0.0480s] [ 25%] 2025-12-04T13:28:26.4712126Z test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_float16 XFAIL [0.0062s] [ 25%] 2025-12-04T13:28:26.4712231Z test_ops.py::TestCommonCUDA::test_python_ref__refs_addr_cuda_int8 XFAIL [0.7173s] [ 25%] 2025-12-04T13:28:26.4712339Z test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int64 PASSED [1.4447s] [ 25%] 2025-12-04T13:28:26.4712450Z test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_int8 PASSED [0.0041s] [ 25%] 2025-12-04T13:28:26.4712573Z test_ops.py::TestCommonCUDA::test_python_ref__refs_alias_copy_cuda_uint8 PASSED [0.7083s] [ 25%] 2025-12-04T13:28:26.4712683Z test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_float32 PASSED [0.0179s] [ 25%] 2025-12-04T13:28:26.4712784Z test_ops.py::TestCommonCUDA::test_python_ref__refs_all_cuda_uint8 PASSED [0.7265s] [ 25%] 2025-12-04T13:28:26.4712918Z test_ops.py::TestCommonCUDA::test_python_ref__refs_allclose_cuda_bfloat16 PASSED [0.0604s] [ 25%] 2025-12-04T13:28:26.4713021Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_float16 PASSED [0.0130s] [ 25%] 2025-12-04T13:28:26.4713125Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_int32 PASSED [0.7256s] [ 25%] 2025-12-04T13:28:26.4713225Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amax_cuda_uint8 PASSED [0.0093s] [ 25%] 2025-12-04T13:28:26.4713334Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_float64 PASSED [0.0099s] [ 25%] 
2025-12-04T13:28:26.4713436Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int64 PASSED [0.0075s] [ 25%] 2025-12-04T13:28:26.4713537Z test_ops.py::TestCommonCUDA::test_python_ref__refs_amin_cuda_int8 PASSED [0.7164s] [ 25%] 2025-12-04T13:28:26.4713654Z test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex128 PASSED [0.0152s] [ 25%] 2025-12-04T13:28:26.4713761Z test_ops.py::TestCommonCUDA::test_python_ref__refs_any_cuda_complex64 PASSED [0.0134s] [ 25%] 2025-12-04T13:28:26.4713873Z test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_bfloat16 PASSED [0.0165s] [ 25%] 2025-12-04T13:28:26.4713978Z test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_float32 PASSED [0.0156s] [ 25%] 2025-12-04T13:28:26.4714085Z test_ops.py::TestCommonCUDA::test_python_ref__refs_arange_cuda_int32 PASSED [0.0080s] [ 25%] 2025-12-04T13:28:26.4714203Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_complex64 XFAIL [0.0025s] [ 25%] 2025-12-04T13:28:26.4714318Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int16 XFAIL [0.7193s] [ 25%] 2025-12-04T13:28:26.4714429Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_copy_cuda_int8 XFAIL [0.0030s] [ 25%] 2025-12-04T13:28:26.4714538Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_bool PASSED [0.7131s] [ 25%] 2025-12-04T13:28:26.4714645Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_cuda_int16 PASSED [0.0037s] [ 25%] 2025-12-04T13:28:26.4714778Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_bfloat16 PASSED [0.7201s] [ 25%] 2025-12-04T13:28:26.4714909Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_complex64 PASSED [0.0051s] [ 25%] 2025-12-04T13:28:26.4715037Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_partial_views_cuda_int32 PASSED [0.7130s] [ 26%] 2025-12-04T13:28:26.4715168Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_float32 PASSED [0.0063s] [ 26%] 2025-12-04T13:28:26.4715291Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int16 PASSED [0.7113s] [ 26%] 2025-12-04T13:28:26.4715407Z test_ops.py::TestCommonCUDA::test_python_ref__refs_as_strided_scatter_cuda_int64 PASSED [0.0058s] [ 26%] 2025-12-04T13:28:26.4715514Z test_ops.py::TestCommonCUDA::test_python_ref__refs_asin_cuda_bool PASSED [0.0211s] [ 26%] 2025-12-04T13:28:26.4715619Z test_ops.py::TestCommonCUDA::test_python_ref__refs_asinh_cuda_float64 PASSED [0.7292s] [ 26%] 2025-12-04T13:28:26.4715726Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bfloat16 PASSED [0.0830s] [ 26%] 2025-12-04T13:28:26.4715832Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atan2_cuda_bool PASSED [0.0670s] [ 26%] 2025-12-04T13:28:26.4715931Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atan_cuda_uint8 PASSED [0.0158s] [ 26%] 2025-12-04T13:28:26.4716040Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_bfloat16 PASSED [0.0215s] [ 26%] 2025-12-04T13:28:26.4716159Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_complex64 PASSED [0.0283s] [ 26%] 2025-12-04T13:28:26.4716264Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float32 PASSED [0.0152s] [ 26%] 2025-12-04T13:28:26.4716368Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atanh_cuda_float64 PASSED [0.7380s] [ 26%] 2025-12-04T13:28:26.4716489Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_bool PASSED [0.0051s] [ 26%] 2025-12-04T13:28:26.4716602Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex128 PASSED [0.7317s] [ 26%] 2025-12-04T13:28:26.4716718Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex32 PASSED [0.0064s] [ 26%] 2025-12-04T13:28:26.4716832Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_complex64 PASSED [0.7325s] [ 26%] 
2025-12-04T13:28:26.4716946Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_float64 PASSED [0.0061s] [ 26%] 2025-12-04T13:28:26.4717052Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int16 PASSED [0.7348s] [ 26%] 2025-12-04T13:28:26.4717162Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_1d_cuda_int8 PASSED [0.0052s] [ 26%] 2025-12-04T13:28:26.4717279Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_2d_cuda_bool PASSED [0.0043s] [ 26%] 2025-12-04T13:28:26.4717396Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_float64 PASSED [0.0054s] [ 26%] 2025-12-04T13:28:26.4717503Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_int32 PASSED [0.0043s] [ 26%] 2025-12-04T13:28:26.4717613Z test_ops.py::TestCommonCUDA::test_python_ref__refs_atleast_3d_cuda_uint8 PASSED [0.0043s] [ 26%] 2025-12-04T13:28:26.4717735Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int16 PASSED [0.0472s] [ 26%] 2025-12-04T13:28:26.4717852Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_left_shift_cuda_int32 PASSED [0.0482s] [ 26%] 2025-12-04T13:28:26.4717964Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_int64 PASSED [0.0126s] [ 26%] 2025-12-04T13:28:26.4718071Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_not_cuda_uint8 PASSED [0.0117s] [ 26%] 2025-12-04T13:28:26.4718182Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bitwise_xor_cuda_bool PASSED [0.0438s] [ 26%] 2025-12-04T13:28:26.4718289Z test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_bool PASSED [0.7308s] [ 26%] 2025-12-04T13:28:26.4718503Z test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex32 PASSED [0.0186s] [ 26%] 2025-12-04T13:28:26.4718613Z test_ops.py::TestCommonCUDA::test_python_ref__refs_block_diag_cuda_complex64 PASSED [0.0166s] [ 26%] 2025-12-04T13:28:26.4718737Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_tensors_cuda_float64 PASSED [0.0077s] [ 26%] 2025-12-04T13:28:26.4718864Z test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_complex64 PASSED [0.0050s] [ 26%] 2025-12-04T13:28:26.4718982Z test_ops.py::TestCommonCUDA::test_python_ref__refs_broadcast_to_cuda_float32 PASSED [0.7297s] [ 26%] 2025-12-04T13:28:26.4719090Z test_ops.py::TestCommonCUDA::test_python_ref__refs_bucketize_cuda_float64 PASSED [0.3271s] [ 26%] 2025-12-04T13:28:26.4719198Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_bfloat16 PASSED [0.0085s] [ 26%] 2025-12-04T13:28:26.4719300Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int16 PASSED [0.0071s] [ 26%] 2025-12-04T13:28:26.4719404Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cat_cuda_int64 PASSED [0.0069s] [ 26%] 2025-12-04T13:28:26.4719591Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float16 SKIPPED [0.0001s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 26%] 2025-12-04T13:28:26.4719772Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cauchy_cuda_float64 SKIPPED [0.0001s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 26%] 2025-12-04T13:28:26.4719881Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ceil_cuda_float64 PASSED [0.7477s] [ 26%] 2025-12-04T13:28:26.4719994Z test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_bool PASSED [0.0149s] [ 26%] 2025-12-04T13:28:26.4720108Z test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_complex128 PASSED [0.7400s] [ 26%] 2025-12-04T13:28:26.4720222Z test_ops.py::TestCommonCUDA::test_python_ref__refs_chunk_cuda_uint8 PASSED [0.0148s] [ 26%] 2025-12-04T13:28:26.4720333Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_cuda_bfloat16 PASSED [0.0483s] [ 26%] 2025-12-04T13:28:26.4720439Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_bool PASSED [0.0709s] [ 26%] 2025-12-04T13:28:26.4720588Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float16 PASSED [0.1186s] [ 26%] 2025-12-04T13:28:26.4720695Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_float64 PASSED [0.0895s] [ 26%] 2025-12-04T13:28:26.4720806Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_max_cuda_int64 PASSED [0.0765s] [ 26%] 2025-12-04T13:28:26.4720912Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_int64 PASSED [0.0765s] [ 26%] 2025-12-04T13:28:26.4721022Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clamp_min_cuda_uint8 PASSED [0.0795s] [ 26%] 2025-12-04T13:28:26.4721138Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_bfloat16 PASSED [0.0259s] [ 26%] 2025-12-04T13:28:26.4721246Z test_ops.py::TestCommonCUDA::test_python_ref__refs_clone_cuda_int16 PASSED [0.7569s] [ 26%] 2025-12-04T13:28:26.4721360Z test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_complex32 PASSED [0.0057s] [ 26%] 2025-12-04T13:28:26.4721475Z test_ops.py::TestCommonCUDA::test_python_ref__refs_column_stack_cuda_uint8 PASSED [0.0039s] [ 26%] 2025-12-04T13:28:26.4721586Z test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_bfloat16 PASSED [0.7471s] [ 26%] 2025-12-04T13:28:26.4721691Z test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_complex64 PASSED [0.0321s] [ 26%] 2025-12-04T13:28:26.4721798Z test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_float64 PASSED [0.7329s] [ 26%] 2025-12-04T13:28:26.4721942Z test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_cuda_int16 PASSED [0.0110s] [ 26%] 2025-12-04T13:28:26.4722065Z test_ops.py::TestCommonCUDA::test_python_ref__refs_conj_physical_cuda_complex32 PASSED [0.0285s] [ 26%] 2025-12-04T13:28:26.4722173Z test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int32 PASSED [0.7462s] [ 26%] 2025-12-04T13:28:26.4722283Z test_ops.py::TestCommonCUDA::test_python_ref__refs_contiguous_cuda_int64 PASSED [0.0185s] [ 26%] 
2025-12-04T13:28:26.4722388Z test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int16 PASSED [0.1089s] [ 26%] 2025-12-04T13:28:26.4722611Z test_ops.py::TestCommonCUDA::test_python_ref__refs_copysign_cuda_int32 PASSED [0.1072s] [ 26%] 2025-12-04T13:28:26.4722733Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cos_cuda_complex128 PASSED [1.1000s] [ 26%] 2025-12-04T13:28:26.4722843Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_complex32 PASSED [0.4175s] [ 26%] 2025-12-04T13:28:26.4722945Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float16 PASSED [0.0228s] [ 27%] 2025-12-04T13:28:26.4723052Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_float32 PASSED [0.0166s] [ 27%] 2025-12-04T13:28:26.4723154Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_int32 PASSED [0.0180s] [ 27%] 2025-12-04T13:28:26.4723258Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cosh_cuda_uint8 PASSED [0.7475s] [ 27%] 2025-12-04T13:28:26.4723374Z test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_complex64 PASSED [0.0128s] [ 27%] 2025-12-04T13:28:26.4723492Z test_ops.py::TestCommonCUDA::test_python_ref__refs_count_nonzero_cuda_float16 PASSED [0.0132s] [ 27%] 2025-12-04T13:28:26.4723604Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_bfloat16 PASSED [0.0149s] [ 27%] 2025-12-04T13:28:26.4723729Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cumprod_cuda_complex128 PASSED [0.0126s] [ 27%] 2025-12-04T13:28:26.4723841Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_bfloat16 PASSED [0.7244s] [ 27%] 2025-12-04T13:28:26.4723945Z test_ops.py::TestCommonCUDA::test_python_ref__refs_cumsum_cuda_int32 PASSED [0.0082s] [ 27%] 2025-12-04T13:28:26.4724069Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_complex64 PASSED [0.0106s] [ 27%] 2025-12-04T13:28:26.4724169Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_cuda_int64 PASSED [0.7374s] [ 27%] 
2025-12-04T13:28:26.4724281Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_bool PASSED [0.0363s] [ 27%] 2025-12-04T13:28:26.4724396Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex128 PASSED [0.0374s] [ 27%] 2025-12-04T13:28:26.4724513Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diag_embed_cuda_complex64 PASSED [0.0370s] [ 27%] 2025-12-04T13:28:26.4724633Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_complex128 PASSED [0.0130s] [ 27%] 2025-12-04T13:28:26.4724752Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_float32 PASSED [0.0124s] [ 27%] 2025-12-04T13:28:26.4724881Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int16 PASSED [0.0100s] [ 27%] 2025-12-04T13:28:26.4725049Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int64 PASSED [0.7476s] [ 27%] 2025-12-04T13:28:26.4725160Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_int8 PASSED [0.0115s] [ 27%] 2025-12-04T13:28:26.4725279Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_copy_cuda_uint8 PASSED [0.0102s] [ 27%] 2025-12-04T13:28:26.4725391Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_complex128 PASSED [0.7298s] [ 27%] 2025-12-04T13:28:26.4725502Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int32 PASSED [0.0109s] [ 27%] 2025-12-04T13:28:26.4725611Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_cuda_int8 PASSED [0.0089s] [ 27%] 2025-12-04T13:28:26.4725725Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_bool PASSED [0.0093s] [ 27%] 2025-12-04T13:28:26.4725854Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex128 PASSED [0.7438s] [ 27%] 2025-12-04T13:28:26.4725978Z test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_complex64 PASSED [0.0134s] [ 27%] 2025-12-04T13:28:26.4726101Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_diagonal_scatter_cuda_float64 PASSED [0.0111s] [ 27%] 2025-12-04T13:28:26.4726205Z test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_bool PASSED [0.2380s] [ 27%] 2025-12-04T13:28:26.4726317Z test_ops.py::TestCommonCUDA::test_python_ref__refs_digamma_cuda_int32 PASSED [0.0187s] [ 27%] 2025-12-04T13:28:26.4726448Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_float16 PASSED [0.3416s] [ 27%] 2025-12-04T13:28:26.4726571Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_floor_rounding_cuda_int32 PASSED [0.1378s] [ 27%] 2025-12-04T13:28:26.4726691Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_float32 PASSED [0.0664s] [ 27%] 2025-12-04T13:28:26.4726815Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_int64 PASSED [0.0844s] [ 27%] 2025-12-04T13:28:26.4726934Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_no_rounding_mode_cuda_uint8 PASSED [0.0831s] [ 27%] 2025-12-04T13:28:26.4727053Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int32 PASSED [0.0529s] [ 27%] 2025-12-04T13:28:26.4727168Z test_ops.py::TestCommonCUDA::test_python_ref__refs_div_trunc_rounding_cuda_int8 PASSED [0.0519s] [ 27%] 2025-12-04T13:28:26.4727275Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dot_cuda_complex64 XFAIL [0.0026s] [ 27%] 2025-12-04T13:28:26.4727386Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bfloat16 PASSED [0.7500s] [ 27%] 2025-12-04T13:28:26.4727508Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_bool PASSED [0.0036s] [ 27%] 2025-12-04T13:28:26.4727622Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_complex128 PASSED [0.7335s] [ 27%] 2025-12-04T13:28:26.4727728Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_float16 PASSED [0.0054s] [ 27%] 2025-12-04T13:28:26.4727845Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_dsplit_cuda_int16 PASSED [0.0036s] [ 27%] 2025-12-04T13:28:26.4727953Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex32 PASSED [0.0042s] [ 27%] 2025-12-04T13:28:26.4728063Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_complex64 PASSED [0.7223s] [ 27%] 2025-12-04T13:28:26.4728166Z test_ops.py::TestCommonCUDA::test_python_ref__refs_dstack_cuda_int16 PASSED [0.0052s] [ 27%] 2025-12-04T13:28:26.4728328Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_complex32 SKIPPED [0.0002s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4728483Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float16 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4728650Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_float32 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4728798Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_cuda_int16 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4728954Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_bool SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4729114Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_complex64 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4729276Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_like_cuda_float16 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 27%] 2025-12-04T13:28:26.4729453Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bfloat16 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 27%] 2025-12-04T13:28:26.4729621Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_bool SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 27%] 2025-12-04T13:28:26.4729797Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float32 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 27%] 2025-12-04T13:28:26.4729967Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_float64 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 27%] 2025-12-04T13:28:26.4730137Z test_ops.py::TestCommonCUDA::test_python_ref__refs_empty_strided_cuda_int16 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 27%] 2025-12-04T13:28:26.4730252Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eq_cuda_complex32 PASSED [0.0849s] [ 27%] 2025-12-04T13:28:26.4730363Z test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_bfloat16 PASSED [0.0065s] [ 27%] 2025-12-04T13:28:26.4730471Z test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_complex64 PASSED [0.0058s] [ 27%] 2025-12-04T13:28:26.4730581Z test_ops.py::TestCommonCUDA::test_python_ref__refs_equal_cuda_int32 PASSED [0.0057s] [ 27%] 2025-12-04T13:28:26.4730684Z test_ops.py::TestCommonCUDA::test_python_ref__refs_erf_cuda_int64 PASSED [0.0171s] [ 27%] 2025-12-04T13:28:26.4730789Z test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_bool PASSED [0.9388s] [ 27%] 2025-12-04T13:28:26.4730896Z test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_float32 PASSED [0.1672s] [ 27%] 2025-12-04T13:28:26.4730998Z test_ops.py::TestCommonCUDA::test_python_ref__refs_erfc_cuda_int32 PASSED [0.0186s] [ 27%] 2025-12-04T13:28:26.4731105Z test_ops.py::TestCommonCUDA::test_python_ref__refs_erfinv_cuda_int32 PASSED [0.2344s] [ 27%] 2025-12-04T13:28:26.4731210Z test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_complex64 PASSED [0.3932s] [ 28%] 2025-12-04T13:28:26.4731326Z test_ops.py::TestCommonCUDA::test_python_ref__refs_exp2_cuda_int16 PASSED [0.2008s] [ 28%] 2025-12-04T13:28:26.4731431Z test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_complex128 PASSED [1.0951s] [ 28%] 2025-12-04T13:28:26.4731539Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_float32 PASSED [0.0190s] [ 28%] 2025-12-04T13:28:26.4731648Z test_ops.py::TestCommonCUDA::test_python_ref__refs_exp_cuda_uint8 PASSED [0.0178s] [ 28%] 2025-12-04T13:28:26.4731767Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_complex128 PASSED [0.7295s] [ 28%] 2025-12-04T13:28:26.4731909Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int32 PASSED [0.0043s] [ 28%] 2025-12-04T13:28:26.4732020Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_as_cuda_int64 PASSED [0.7382s] [ 28%] 2025-12-04T13:28:26.4732136Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_complex128 PASSED [0.0090s] [ 28%] 2025-12-04T13:28:26.4732254Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_float64 PASSED [0.8354s] [ 28%] 2025-12-04T13:28:26.4732363Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_copy_cuda_int16 PASSED [0.0083s] [ 28%] 2025-12-04T13:28:26.4732489Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_bfloat16 PASSED [0.7575s] [ 28%] 2025-12-04T13:28:26.4732595Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expand_cuda_float64 PASSED [0.0077s] [ 28%] 2025-12-04T13:28:26.4732706Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex128 PASSED [0.0295s] [ 28%] 2025-12-04T13:28:26.4732816Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_complex64 PASSED [0.0286s] [ 28%] 2025-12-04T13:28:26.4732923Z test_ops.py::TestCommonCUDA::test_python_ref__refs_expm1_cuda_float64 PASSED [0.7452s] [ 28%] 2025-12-04T13:28:26.4733118Z test_ops.py::TestCommonCUDA::test_python_ref__refs_exponential_cuda_float32 SKIPPED [0.0002s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 28%] 2025-12-04T13:28:26.4733222Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_bfloat16 PASSED [0.0784s] [ 28%] 2025-12-04T13:28:26.4733331Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex128 PASSED [0.0779s] [ 28%] 2025-12-04T13:28:26.4733437Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_complex64 PASSED [0.0777s] [ 28%] 2025-12-04T13:28:26.4733545Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float16 PASSED [0.0762s] [ 28%] 2025-12-04T13:28:26.4733647Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_float32 PASSED [0.0762s] [ 28%] 2025-12-04T13:28:26.4733750Z test_ops.py::TestCommonCUDA::test_python_ref__refs_eye_cuda_int16 PASSED [0.0662s] [ 28%] 2025-12-04T13:28:26.4733854Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_bool PASSED [0.0078s] [ 28%] 2025-12-04T13:28:26.4733982Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_complex32 PASSED [1.3561s] [ 28%] 2025-12-04T13:28:26.4734088Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int64 PASSED [0.0104s] [ 28%] 2025-12-04T13:28:26.4734197Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fft2_cuda_int8 PASSED [0.0079s] [ 28%] 2025-12-04T13:28:26.4734307Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_complex32 PASSED [0.7442s] [ 28%] 2025-12-04T13:28:26.4734420Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_float32 PASSED [0.0112s] [ 28%] 2025-12-04T13:28:26.4734524Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int64 PASSED [0.0090s] [ 28%] 2025-12-04T13:28:26.4734633Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftn_cuda_int8 PASSED [0.0088s] [ 28%] 2025-12-04T13:28:26.4734750Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_complex32 PASSED [0.7360s] [ 28%] 2025-12-04T13:28:26.4734860Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_fftshift_cuda_int64 PASSED [0.0071s] [ 28%] 2025-12-04T13:28:26.4734994Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int16 PASSED [0.0085s] [ 28%] 2025-12-04T13:28:26.4735100Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft2_cuda_int64 PASSED [0.0075s] [ 28%] 2025-12-04T13:28:26.4735215Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_complex64 PASSED [0.0064s] [ 28%] 2025-12-04T13:28:26.4735335Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_float64 PASSED [0.1457s] [ 28%] 2025-12-04T13:28:26.4735447Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int32 PASSED [0.0074s] [ 28%] 2025-12-04T13:28:26.4735551Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfft_cuda_int8 PASSED [0.0072s] [ 28%] 2025-12-04T13:28:26.4735666Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_hfftn_cuda_complex64 PASSED [0.0073s] [ 28%] 2025-12-04T13:28:26.4735776Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_complex64 PASSED [0.0074s] [ 28%] 2025-12-04T13:28:26.4735885Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_int64 PASSED [0.0081s] [ 28%] 2025-12-04T13:28:26.4735991Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft2_cuda_uint8 PASSED [0.0080s] [ 28%] 2025-12-04T13:28:26.4736110Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifft_cuda_int32 PASSED [0.7507s] [ 28%] 2025-12-04T13:28:26.4736222Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_complex32 PASSED [0.0117s] [ 28%] 2025-12-04T13:28:26.4736331Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftn_cuda_int8 PASSED [0.0099s] [ 28%] 2025-12-04T13:28:26.4736450Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ifftshift_cuda_complex128 PASSED [0.7368s] [ 28%] 2025-12-04T13:28:26.4736559Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_ihfftn_cuda_int64 PASSED [0.0131s] [ 28%] 2025-12-04T13:28:26.4736669Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_bool PASSED [0.0075s] [ 28%] 2025-12-04T13:28:26.4736778Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft2_cuda_float64 PASSED [0.1565s] [ 28%] 
2025-12-04T13:28:26.4736894Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_complex32 PASSED [0.3178s] [ 28%] 2025-12-04T13:28:26.4737002Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_float32 PASSED [0.0077s] [ 28%] 2025-12-04T13:28:26.4737113Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_int16 PASSED [0.0073s] [ 28%] 2025-12-04T13:28:26.4737220Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfft_cuda_uint8 PASSED [0.0073s] [ 28%] 2025-12-04T13:28:26.4737336Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_complex32 PASSED [0.0097s] [ 28%] 2025-12-04T13:28:26.4737443Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_float16 PASSED [0.7477s] [ 28%] 2025-12-04T13:28:26.4737563Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_irfftn_cuda_int16 PASSED [0.0109s] [ 28%] 2025-12-04T13:28:26.4737670Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float32 PASSED [0.7364s] [ 28%] 2025-12-04T13:28:26.4737782Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfft_cuda_float64 PASSED [0.0085s] [ 28%] 2025-12-04T13:28:26.4737890Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fft_rfftn_cuda_float64 PASSED [0.1530s] [ 28%] 2025-12-04T13:28:26.4738003Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex32 PASSED [0.0198s] [ 28%] 2025-12-04T13:28:26.4738111Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_complex64 PASSED [0.0192s] [ 28%] 2025-12-04T13:28:26.4738221Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float16 PASSED [0.0179s] [ 28%] 2025-12-04T13:28:26.4738326Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flatten_cuda_float32 PASSED [0.0179s] [ 28%] 2025-12-04T13:28:26.4738435Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_bool PASSED [0.0025s] [ 28%] 2025-12-04T13:28:26.4738544Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float16 
PASSED [0.0026s] [ 28%] 2025-12-04T13:28:26.4738663Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fliplr_cuda_float32 PASSED [0.0027s] [ 28%] 2025-12-04T13:28:26.4738772Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_float64 PASSED [0.0026s] [ 29%] 2025-12-04T13:28:26.4738876Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_int64 PASSED [0.0026s] [ 29%] 2025-12-04T13:28:26.4738993Z test_ops.py::TestCommonCUDA::test_python_ref__refs_flipud_cuda_uint8 PASSED [0.0025s] [ 29%] 2025-12-04T13:28:26.4739101Z test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_bool PASSED [0.0801s] [ 29%] 2025-12-04T13:28:26.4739221Z test_ops.py::TestCommonCUDA::test_python_ref__refs_float_power_cuda_complex64 PASSED [0.1007s] [ 29%] 2025-12-04T13:28:26.4739329Z test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_bfloat16 PASSED [0.0214s] [ 29%] 2025-12-04T13:28:26.4739438Z test_ops.py::TestCommonCUDA::test_python_ref__refs_floor_cuda_float32 PASSED [0.7556s] [ 29%] 2025-12-04T13:28:26.4739542Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_float64 PASSED [0.0560s] [ 29%] 2025-12-04T13:28:26.4739647Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmax_cuda_int16 PASSED [0.0416s] [ 29%] 2025-12-04T13:28:26.4739758Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_bool PASSED [0.0393s] [ 29%] 2025-12-04T13:28:26.4739862Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmin_cuda_int16 PASSED [0.0417s] [ 29%] 2025-12-04T13:28:26.4739963Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_float64 PASSED [0.0629s] [ 29%] 2025-12-04T13:28:26.4740064Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_int32 PASSED [0.7919s] [ 29%] 2025-12-04T13:28:26.4740162Z test_ops.py::TestCommonCUDA::test_python_ref__refs_fmod_cuda_uint8 PASSED [0.0519s] [ 29%] 2025-12-04T13:28:26.4740269Z test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float16 PASSED [0.0332s] [ 29%] 
2025-12-04T13:28:26.4740374Z test_ops.py::TestCommonCUDA::test_python_ref__refs_frac_cuda_float64 PASSED [0.7738s] [ 29%] 2025-12-04T13:28:26.4740473Z test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int16 PASSED [0.1709s] [ 29%] 2025-12-04T13:28:26.4740576Z test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_int8 PASSED [0.1784s] [ 29%] 2025-12-04T13:28:26.4740674Z test_ops.py::TestCommonCUDA::test_python_ref__refs_gcd_cuda_uint8 PASSED [0.1686s] [ 29%] 2025-12-04T13:28:26.4740778Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bfloat16 PASSED [0.7963s] [ 29%] 2025-12-04T13:28:26.4740874Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_bool PASSED [0.0458s] [ 29%] 2025-12-04T13:28:26.4740973Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_float16 PASSED [0.0690s] [ 29%] 2025-12-04T13:28:26.4741069Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int32 PASSED [0.0469s] [ 29%] 2025-12-04T13:28:26.4741182Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int64 PASSED [0.0474s] [ 29%] 2025-12-04T13:28:26.4741279Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_int8 PASSED [0.0456s] [ 29%] 2025-12-04T13:28:26.4741375Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ge_cuda_uint8 PASSED [0.0450s] [ 29%] 2025-12-04T13:28:26.4741559Z test_ops.py::TestCommonCUDA::test_python_ref__refs_geometric_cuda_int16 SKIPPED [0.0001s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 29%] 2025-12-04T13:28:26.4741666Z test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_bfloat16 PASSED [0.0680s] [ 29%] 2025-12-04T13:28:26.4741761Z test_ops.py::TestCommonCUDA::test_python_ref__refs_gt_cuda_int8 PASSED [0.0447s] [ 29%] 2025-12-04T13:28:26.4741913Z test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float16 PASSED [0.1445s] [ 29%] 2025-12-04T13:28:26.4742022Z test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_float64 PASSED [0.1194s] [ 29%] 
2025-12-04T13:28:26.4742133Z test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_int8 PASSED [0.1121s] [ 29%] 2025-12-04T13:28:26.4742255Z test_ops.py::TestCommonCUDA::test_python_ref__refs_heaviside_cuda_uint8 PASSED [0.1125s] [ 29%] 2025-12-04T13:28:26.4742364Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_complex64 PASSED [0.0039s] [ 29%] 2025-12-04T13:28:26.4742472Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int16 PASSED [0.7489s] [ 29%] 2025-12-04T13:28:26.4742585Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hsplit_cuda_int64 PASSED [0.0045s] [ 29%] 2025-12-04T13:28:26.4742690Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_bool PASSED [0.0034s] [ 29%] 2025-12-04T13:28:26.4742794Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_complex32 PASSED [0.7324s] [ 29%] 2025-12-04T13:28:26.4742898Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_int64 PASSED [0.0045s] [ 29%] 2025-12-04T13:28:26.4743000Z test_ops.py::TestCommonCUDA::test_python_ref__refs_hstack_cuda_uint8 PASSED [0.7166s] [ 29%] 2025-12-04T13:28:26.4743101Z test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int32 PASSED [0.2271s] [ 29%] 2025-12-04T13:28:26.4746277Z test_ops.py::TestCommonCUDA::test_python_ref__refs_i0_cuda_int8 PASSED [0.0165s] [ 29%] 2025-12-04T13:28:26.4746436Z test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex128 PASSED [0.7533s] [ 29%] 2025-12-04T13:28:26.4746547Z test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex32 PASSED [0.0317s] [ 29%] 2025-12-04T13:28:26.4746649Z test_ops.py::TestCommonCUDA::test_python_ref__refs_imag_cuda_complex64 PASSED [0.7591s] [ 29%] 2025-12-04T13:28:26.4746755Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_float16 XFAIL [0.0036s] [ 29%] 2025-12-04T13:28:26.4746855Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int16 XFAIL [0.7381s] [ 29%] 2025-12-04T13:28:26.4746959Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_index_add_cuda_int32 XFAIL [0.0031s] [ 29%] 2025-12-04T13:28:26.4747062Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_copy_cuda_int32 XFAIL [0.7316s] [ 29%] 2025-12-04T13:28:26.4747172Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_bfloat16 XFAIL [0.7335s] [ 29%] 2025-12-04T13:28:26.4747282Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_fill_cuda_complex32 XFAIL [0.7496s] [ 29%] 2025-12-04T13:28:26.4747389Z test_ops.py::TestCommonCUDA::test_python_ref__refs_index_select_cuda_int64 XFAIL [0.7122s] [ 29%] 2025-12-04T13:28:26.4747490Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_bool PASSED [0.8771s] [ 29%] 2025-12-04T13:28:26.4747598Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_complex128 PASSED [0.1731s] [ 29%] 2025-12-04T13:28:26.4747702Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isclose_cuda_float32 PASSED [0.1545s] [ 29%] 2025-12-04T13:28:26.4747826Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_complex128 PASSED [0.0294s] [ 29%] 2025-12-04T13:28:26.4747929Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int32 PASSED [0.7468s] [ 29%] 2025-12-04T13:28:26.4748031Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int64 PASSED [0.0176s] [ 29%] 2025-12-04T13:28:26.4748135Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isfinite_cuda_int8 PASSED [0.0149s] [ 29%] 2025-12-04T13:28:26.4748235Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float16 PASSED [0.0212s] [ 29%] 2025-12-04T13:28:26.4748337Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_float32 PASSED [0.0170s] [ 29%] 2025-12-04T13:28:26.4748437Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int16 PASSED [0.0140s] [ 29%] 2025-12-04T13:28:26.4748536Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isinf_cuda_int8 PASSED [0.0134s] [ 29%] 2025-12-04T13:28:26.4748641Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_complex128 PASSED [0.0251s] [ 29%] 2025-12-04T13:28:26.4748744Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_float32 PASSED [0.0126s] [ 29%] 2025-12-04T13:28:26.4748855Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isnan_cuda_int8 PASSED [0.7586s] [ 29%] 2025-12-04T13:28:26.4748962Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bfloat16 PASSED [0.0203s] [ 29%] 2025-12-04T13:28:26.4749066Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isneginf_cuda_bool PASSED [0.0170s] [ 30%] 2025-12-04T13:28:26.4749179Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int32 PASSED [0.0142s] [ 30%] 2025-12-04T13:28:26.4749281Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isposinf_cuda_int8 PASSED [0.0135s] [ 30%] 2025-12-04T13:28:26.4749383Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_float16 PASSED [0.0210s] [ 30%] 2025-12-04T13:28:26.4749482Z test_ops.py::TestCommonCUDA::test_python_ref__refs_isreal_cuda_int16 PASSED [0.0154s] [ 30%] 2025-12-04T13:28:26.4749582Z test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float32 PASSED [0.7386s] [ 30%] 2025-12-04T13:28:26.4749682Z test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_float64 PASSED [0.0047s] [ 30%] 2025-12-04T13:28:26.4749779Z test_ops.py::TestCommonCUDA::test_python_ref__refs_item_cuda_int8 PASSED [0.7200s] [ 30%] 2025-12-04T13:28:26.4749889Z test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_bool PASSED [0.0454s] [ 30%] 2025-12-04T13:28:26.4749987Z test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_float64 PASSED [0.0480s] [ 30%] 2025-12-04T13:28:26.4750084Z test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int32 PASSED [0.0458s] [ 30%] 2025-12-04T13:28:26.4750178Z test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_int64 PASSED [0.0470s] [ 30%] 2025-12-04T13:28:26.4750273Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_le_cuda_uint8 PASSED [0.0452s] [ 30%] 2025-12-04T13:28:26.4750375Z test_ops.py::TestCommonCUDA::test_python_ref__refs_lerp_cuda_complex64 PASSED [0.0358s] [ 30%] 2025-12-04T13:28:26.4750485Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int32 PASSED [0.7372s] [ 30%] 2025-12-04T13:28:26.4750593Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int64 PASSED [0.0080s] [ 30%] 2025-12-04T13:28:26.4750699Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_cross_cuda_int8 PASSED [0.0065s] [ 30%] 2025-12-04T13:28:26.4750811Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_bool PASSED [0.0065s] [ 30%] 2025-12-04T13:28:26.4750924Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int16 PASSED [0.0064s] [ 30%] 2025-12-04T13:28:26.4751034Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_diagonal_cuda_int64 PASSED [0.0064s] [ 30%] 2025-12-04T13:28:26.4751143Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_complex64 PASSED [0.0890s] [ 30%] 2025-12-04T13:28:26.4751262Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_norm_cuda_float16 PASSED [0.0724s] [ 30%] 2025-12-04T13:28:26.4751371Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vecdot_cuda_float32 PASSED [0.7371s] [ 30%] 2025-12-04T13:28:26.4751495Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_complex128 PASSED [0.1128s] [ 30%] 2025-12-04T13:28:26.4751615Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linalg_vector_norm_cuda_float32 PASSED [0.1045s] [ 30%] 2025-12-04T13:28:26.4751726Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_complex64 PASSED [0.0403s] [ 30%] 2025-12-04T13:28:26.4751828Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int16 XFAIL [0.0028s] [ 30%] 2025-12-04T13:28:26.4751987Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_cuda_int32 XFAIL [0.0029s] [ 30%] 2025-12-04T13:28:26.4752111Z test_ops.py::TestCommonCUDA::test_python_ref__refs_linspace_tensor_overload_cuda_float32 XFAIL [0.7240s] [ 30%] 2025-12-04T13:28:26.4752212Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_bool PASSED [0.7404s] [ 30%] 2025-12-04T13:28:26.4752317Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_complex128 PASSED [0.3322s] [ 30%] 2025-12-04T13:28:26.4752434Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float32 PASSED [0.0172s] [ 30%] 2025-12-04T13:28:26.4752535Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log10_cuda_float64 PASSED [0.0171s] [ 30%] 2025-12-04T13:28:26.4752641Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_complex64 PASSED [0.7549s] [ 30%] 2025-12-04T13:28:26.4752754Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int16 PASSED [0.0190s] [ 30%] 2025-12-04T13:28:26.4752853Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log1p_cuda_int8 PASSED [0.0161s] [ 30%] 2025-12-04T13:28:26.4752956Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_bfloat16 PASSED [0.7424s] [ 30%] 2025-12-04T13:28:26.4753053Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_int64 PASSED [0.0206s] [ 30%] 2025-12-04T13:28:26.4753151Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log2_cuda_uint8 PASSED [0.0178s] [ 30%] 2025-12-04T13:28:26.4753248Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_cuda_uint8 PASSED [0.0178s] [ 30%] 2025-12-04T13:28:26.4753367Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_bool PASSED [0.0125s] [ 30%] 2025-12-04T13:28:26.4753508Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex32 PASSED [0.7261s] [ 30%] 2025-12-04T13:28:26.4753634Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_complex64 PASSED [0.7326s] [ 30%] 
2025-12-04T13:28:26.4753755Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_float64 PASSED [0.7312s] [ 30%] 2025-12-04T13:28:26.4753875Z test_ops.py::TestCommonCUDA::test_python_ref__refs_log_softmax_with_dtype_cuda_int64 PASSED [0.7329s] [ 30%] 2025-12-04T13:28:26.4753983Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp2_cuda_float64 PASSED [0.0070s] [ 30%] 2025-12-04T13:28:26.4754091Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logaddexp_cuda_bfloat16 PASSED [0.1904s] [ 30%] 2025-12-04T13:28:26.4754197Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int16 PASSED [0.0643s] [ 30%] 2025-12-04T13:28:26.4754304Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_and_cuda_int64 PASSED [0.0633s] [ 30%] 2025-12-04T13:28:26.4754409Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_bool PASSED [0.7384s] [ 30%] 2025-12-04T13:28:26.4754522Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_not_cuda_complex128 PASSED [0.0304s] [ 30%] 2025-12-04T13:28:26.4754633Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_complex64 PASSED [0.8054s] [ 30%] 2025-12-04T13:28:26.4754742Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_float64 PASSED [0.0689s] [ 30%] 2025-12-04T13:28:26.4754859Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_or_cuda_int32 PASSED [0.0637s] [ 30%] 2025-12-04T13:28:26.4754963Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_bool PASSED [0.0443s] [ 30%] 2025-12-04T13:28:26.4755068Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logical_xor_cuda_int16 PASSED [0.7818s] [ 30%] 2025-12-04T13:28:26.4755170Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int32 XFAIL [0.0423s] [ 30%] 2025-12-04T13:28:26.4755273Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_cuda_int8 PASSED [0.8412s] [ 30%] 2025-12-04T13:28:26.4755400Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_bfloat16 XFAIL [0.0118s] [ 30%] 2025-12-04T13:28:26.4755523Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logspace_tensor_overload_cuda_int16 XFAIL [0.7280s] [ 30%] 2025-12-04T13:28:26.4755629Z test_ops.py::TestCommonCUDA::test_python_ref__refs_logsumexp_cuda_int64 PASSED [0.7371s] [ 30%] 2025-12-04T13:28:26.4755739Z test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float16 PASSED [0.0079s] [ 30%] 2025-12-04T13:28:26.4755846Z test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_float64 PASSED [0.0077s] [ 30%] 2025-12-04T13:28:26.4755970Z test_ops.py::TestCommonCUDA::test_python_ref__refs_masked_fill_cuda_int8 PASSED [0.7159s] [ 30%] 2025-12-04T13:28:26.4756074Z test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_float64 PASSED [0.0571s] [ 30%] 2025-12-04T13:28:26.4756179Z test_ops.py::TestCommonCUDA::test_python_ref__refs_maximum_cuda_int64 PASSED [0.0420s] [ 30%] 2025-12-04T13:28:26.4756294Z test_ops.py::TestCommonCUDA::test_python_ref__refs_mean_cuda_complex64 PASSED [0.0121s] [ 30%] 2025-12-04T13:28:26.4756421Z test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_float64 PASSED [0.0089s] [ 31%] 2025-12-04T13:28:26.4756548Z test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int64 PASSED [0.7310s] [ 31%] 2025-12-04T13:28:26.4756672Z test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_list_of_tensors_cuda_int8 PASSED [0.0087s] [ 31%] 2025-12-04T13:28:26.4756804Z test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_float32 PASSED [0.0094s] [ 31%] 2025-12-04T13:28:26.4756930Z test_ops.py::TestCommonCUDA::test_python_ref__refs_meshgrid_variadic_tensors_cuda_uint8 PASSED [0.0070s] [ 31%] 2025-12-04T13:28:26.4757050Z test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float32 PASSED [0.0548s] [ 31%] 2025-12-04T13:28:26.4757155Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_float64 PASSED [0.0542s] [ 31%] 2025-12-04T13:28:26.4757259Z test_ops.py::TestCommonCUDA::test_python_ref__refs_minimum_cuda_int32 PASSED [0.0419s] [ 31%] 2025-12-04T13:28:26.4757363Z test_ops.py::TestCommonCUDA::test_python_ref__refs_movedim_cuda_float16 PASSED [0.0060s] [ 31%] 2025-12-04T13:28:26.4757467Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_bool PASSED [0.0149s] [ 31%] 2025-12-04T13:28:26.4757572Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nan_to_num_cuda_int8 PASSED [0.0122s] [ 31%] 2025-12-04T13:28:26.4757687Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex128 XFAIL [0.0025s] [ 31%] 2025-12-04T13:28:26.4757796Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_complex32 XFAIL [0.7226s] [ 31%] 2025-12-04T13:28:26.4757908Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_float64 XFAIL [0.0030s] [ 31%] 2025-12-04T13:28:26.4758014Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_int16 XFAIL [0.7171s] [ 31%] 2025-12-04T13:28:26.4758122Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_copy_cuda_uint8 XFAIL [0.0030s] [ 31%] 2025-12-04T13:28:26.4758224Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_bool PASSED [0.7386s] [ 31%] 2025-12-04T13:28:26.4758336Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_complex128 PASSED [0.0213s] [ 31%] 2025-12-04T13:28:26.4758453Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float32 PASSED [0.0197s] [ 31%] 2025-12-04T13:28:26.4758557Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_float64 PASSED [0.0196s] [ 31%] 2025-12-04T13:28:26.4758661Z test_ops.py::TestCommonCUDA::test_python_ref__refs_narrow_cuda_int64 PASSED [0.0157s] [ 31%] 2025-12-04T13:28:26.4758783Z test_ops.py::TestCommonCUDA::test_python_ref__refs_native_layer_norm_cuda_bfloat16 PASSED [0.0461s] [ 31%] 
2025-12-04T13:28:26.4758883Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ne_cuda_int8 PASSED [0.0450s] [ 31%] 2025-12-04T13:28:26.4758987Z test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_complex128 PASSED [0.1752s] [ 31%] 2025-12-04T13:28:26.4759089Z test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int16 PASSED [0.0113s] [ 31%] 2025-12-04T13:28:26.4759186Z test_ops.py::TestCommonCUDA::test_python_ref__refs_neg_cuda_int32 PASSED [0.7352s] [ 31%] 2025-12-04T13:28:26.4759352Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_complex64 SKIPPED [0.0002s] (Expected: empty is not comparable) [ 31%] 2025-12-04T13:28:26.4759508Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float16 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 31%] 2025-12-04T13:28:26.4759675Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_cuda_float32 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 31%] 2025-12-04T13:28:26.4759868Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_complex128 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 31%] 2025-12-04T13:28:26.4760042Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_empty_strided_cuda_float16 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 31%] 2025-12-04T13:28:26.4760155Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex32 PASSED [0.7317s] [ 31%] 2025-12-04T13:28:26.4760265Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_complex64 PASSED [0.0074s] [ 31%] 2025-12-04T13:28:26.4760372Z test_ops.py::TestCommonCUDA::test_python_ref__refs_new_zeros_cuda_float32 PASSED [0.7152s] [ 31%] 2025-12-04T13:28:26.4760481Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nextafter_cuda_bfloat16 PASSED [0.0570s] [ 31%] 2025-12-04T13:28:26.4760660Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_alpha_dropout_cuda_float16 SKIPPED [0.0002s] (Expected: 
dropout is not comparable) [ 31%] 2025-12-04T13:28:26.4760791Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_celu_cuda_float32 PASSED [0.0316s] [ 31%] 2025-12-04T13:28:26.4760922Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_bool PASSED [0.0040s] [ 31%] 2025-12-04T13:28:26.4761061Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_complex128 PASSED [0.0042s] [ 31%] 2025-12-04T13:28:26.4761196Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_float64 PASSED [0.7232s] [ 31%] 2025-12-04T13:28:26.4761327Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int32 PASSED [0.0052s] [ 31%] 2025-12-04T13:28:26.4761461Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_channel_shuffle_cuda_int64 PASSED [0.0040s] [ 31%] 2025-12-04T13:28:26.4761580Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_elu_cuda_bfloat16 PASSED [0.0425s] [ 31%] 2025-12-04T13:28:26.4761710Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_bfloat16 PASSED [0.8293s] [ 31%] 2025-12-04T13:28:26.4761836Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_group_norm_cuda_float32 PASSED [0.0764s] [ 31%] 2025-12-04T13:28:26.4761997Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_hardtanh_cuda_int16 PASSED [0.0390s] [ 31%] 2025-12-04T13:28:26.4762123Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_huber_loss_cuda_float32 PASSED [0.7366s] [ 31%] 2025-12-04T13:28:26.4762256Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_l1_loss_cuda_bfloat16 PASSED [0.0095s] [ 31%] 2025-12-04T13:28:26.4762384Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float16 PASSED [0.0132s] [ 31%] 2025-12-04T13:28:26.4762507Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float32 PASSED [0.7410s] [ 31%] 2025-12-04T13:28:26.4762632Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_layer_norm_cuda_float64 PASSED [0.0124s] [ 31%] 2025-12-04T13:28:26.4762776Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_bfloat16 PASSED [0.0126s] [ 31%] 2025-12-04T13:28:26.4762915Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_log_softmax_with_dtype_cuda_int32 PASSED [0.0122s] [ 31%] 2025-12-04T13:28:26.4763049Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_margin_ranking_loss_cuda_int32 PASSED [0.0364s] [ 31%] 2025-12-04T13:28:26.4763172Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_bfloat16 PASSED [0.0405s] [ 31%] 2025-12-04T13:28:26.4763306Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mish_cuda_float32 PASSED [0.0335s] [ 31%] 2025-12-04T13:28:26.4763428Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_mse_loss_cuda_float16 PASSED [0.0076s] [ 31%] 2025-12-04T13:28:26.4763570Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pairwise_distance_cuda_complex64 PASSED [0.0086s] [ 31%] 2025-12-04T13:28:26.4763708Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_bool PASSED [0.0050s] [ 31%] 2025-12-04T13:28:26.4763841Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_float64 PASSED [0.0052s] [ 31%] 2025-12-04T13:28:26.4763967Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_shuffle_cuda_int8 PASSED [0.0049s] [ 31%] 2025-12-04T13:28:26.4764098Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int16 PASSED [0.0049s] [ 31%] 2025-12-04T13:28:26.4764224Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_pixel_unshuffle_cuda_int64 PASSED [0.0048s] [ 31%] 
2025-12-04T13:28:26.4764363Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_poisson_nll_loss_cuda_float64 PASSED [0.0857s] [ 31%] 2025-12-04T13:28:26.4764495Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float16 PASSED [0.8387s] [ 31%] 2025-12-04T13:28:26.4764618Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_prelu_cuda_float32 PASSED [0.0889s] [ 31%] 2025-12-04T13:28:26.4764736Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int16 PASSED [0.7542s] [ 31%] 2025-12-04T13:28:26.4764855Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu6_cuda_int64 PASSED [0.0371s] [ 32%] 2025-12-04T13:28:26.4764977Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_bfloat16 PASSED [0.0331s] [ 32%] 2025-12-04T13:28:26.4765090Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_relu_cuda_int8 PASSED [0.0185s] [ 32%] 2025-12-04T13:28:26.4765210Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_selu_cuda_float32 PASSED [0.0320s] [ 32%] 2025-12-04T13:28:26.4765345Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmax_with_dtype_cuda_int32 PASSED [0.0095s] [ 32%] 2025-12-04T13:28:26.4765485Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softmin_with_dtype_cuda_float64 PASSED [0.7251s] [ 32%] 2025-12-04T13:28:26.4765611Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softplus_cuda_float32 PASSED [0.0470s] [ 32%] 2025-12-04T13:28:26.4765737Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float16 PASSED [0.0746s] [ 32%] 2025-12-04T13:28:26.4765861Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_softshrink_cuda_float64 PASSED [0.0413s] [ 32%] 2025-12-04T13:28:26.4766000Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_tanhshrink_cuda_complex64 PASSED [0.0366s] [ 32%] 2025-12-04T13:28:26.4766125Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_float64 PASSED [0.0257s] [ 32%] 2025-12-04T13:28:26.4766246Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_int8 PASSED [0.0220s] [ 32%] 2025-12-04T13:28:26.4766366Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_threshold_cuda_uint8 PASSED [0.0217s] [ 32%] 2025-12-04T13:28:26.4766503Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_int16 PASSED [0.0187s] [ 32%] 2025-12-04T13:28:26.4766638Z test_ops.py::TestCommonCUDA::test_python_ref__refs_nn_functional_triplet_margin_loss_cuda_uint8 PASSED [0.0184s] [ 32%] 2025-12-04T13:28:26.4766745Z test_ops.py::TestCommonCUDA::test_python_ref__refs_norm_cuda_float64 PASSED [0.0299s] [ 32%] 2025-12-04T13:28:26.4766945Z test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_bfloat16 SKIPPED [0.0001s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 32%] 2025-12-04T13:28:26.4767148Z test_ops.py::TestCommonCUDA::test_python_ref__refs_normal_number_mean_cuda_float64 SKIPPED [0.0001s] (TODO: RuntimeError: no _refs support for torch.rand_like) [ 32%] 2025-12-04T13:28:26.4767258Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_complex32 PASSED [0.7412s] [ 32%] 2025-12-04T13:28:26.4767370Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_float16 PASSED [0.0043s] [ 32%] 2025-12-04T13:28:26.4767472Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_int32 PASSED [0.7324s] [ 32%] 2025-12-04T13:28:26.4767571Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ones_cuda_uint8 PASSED [0.0038s] [ 32%] 2025-12-04T13:28:26.4767685Z test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_float64 PASSED [0.0288s] [ 32%] 2025-12-04T13:28:26.4767794Z test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_int16 PASSED [0.0217s] [ 32%] 2025-12-04T13:28:26.4767903Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_copy_cuda_uint8 PASSED [0.0217s] [ 32%] 2025-12-04T13:28:26.4768011Z test_ops.py::TestCommonCUDA::test_python_ref__refs_permute_cuda_complex128 PASSED [0.0256s] [ 32%] 2025-12-04T13:28:26.4768128Z test_ops.py::TestCommonCUDA::test_python_ref__refs_positive_cuda_int16 PASSED [0.7479s] [ 32%] 2025-12-04T13:28:26.4768227Z test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_int32 PASSED [0.0484s] [ 32%] 2025-12-04T13:28:26.4768326Z test_ops.py::TestCommonCUDA::test_python_ref__refs_pow_cuda_uint8 PASSED [0.0460s] [ 32%] 2025-12-04T13:28:26.4768430Z test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_complex64 PASSED [0.0178s] [ 32%] 2025-12-04T13:28:26.4768534Z test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_float64 PASSED [0.0166s] [ 32%] 2025-12-04T13:28:26.4768635Z test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int16 PASSED [0.0187s] [ 32%] 2025-12-04T13:28:26.4768732Z test_ops.py::TestCommonCUDA::test_python_ref__refs_prod_cuda_int32 PASSED [0.0181s] [ 32%] 2025-12-04T13:28:26.4768840Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float32 PASSED [0.0176s] [ 32%] 2025-12-04T13:28:26.4768944Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rad2deg_cuda_float64 PASSED [0.0174s] [ 32%] 2025-12-04T13:28:26.4769051Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex128 PASSED [0.0035s] [ 32%] 2025-12-04T13:28:26.4769156Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex32 PASSED [0.7287s] [ 32%] 2025-12-04T13:28:26.4769259Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_complex64 PASSED [0.0051s] [ 32%] 2025-12-04T13:28:26.4769360Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int32 PASSED [0.7291s] [ 32%] 2025-12-04T13:28:26.4769459Z test_ops.py::TestCommonCUDA::test_python_ref__refs_ravel_cuda_int8 PASSED [0.0048s] [ 32%] 2025-12-04T13:28:26.4769579Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bfloat16 PASSED [0.7406s] [ 32%] 2025-12-04T13:28:26.4769678Z test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_bool PASSED [0.0129s] [ 32%] 2025-12-04T13:28:26.4769778Z test_ops.py::TestCommonCUDA::test_python_ref__refs_real_cuda_float16 PASSED [0.7544s] [ 32%] 2025-12-04T13:28:26.4769890Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bfloat16 PASSED [0.0255s] [ 32%] 2025-12-04T13:28:26.4769995Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_bool PASSED [0.0215s] [ 32%] 2025-12-04T13:28:26.4770104Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_float16 PASSED [0.7574s] [ 32%] 2025-12-04T13:28:26.4770208Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int16 PASSED [0.0211s] [ 32%] 2025-12-04T13:28:26.4770313Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reciprocal_cuda_int8 PASSED [0.0175s] [ 32%] 2025-12-04T13:28:26.4770417Z test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int64 PASSED [0.0518s] [ 32%] 2025-12-04T13:28:26.4770521Z test_ops.py::TestCommonCUDA::test_python_ref__refs_remainder_cuda_int8 PASSED [0.7800s] [ 32%] 2025-12-04T13:28:26.4770642Z test_ops.py::TestCommonCUDA::test_python_ref__refs_renorm_cuda_complex128 PASSED [0.0114s] [ 32%] 2025-12-04T13:28:26.4770746Z test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float32 PASSED [0.0247s] [ 32%] 2025-12-04T13:28:26.4770860Z test_ops.py::TestCommonCUDA::test_python_ref__refs_repeat_cuda_float64 PASSED [0.0242s] [ 32%] 2025-12-04T13:28:26.4770971Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_bfloat16 PASSED [0.0155s] [ 32%] 2025-12-04T13:28:26.4771075Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_as_cuda_int8 PASSED [0.0128s] [ 32%] 2025-12-04T13:28:26.4771179Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_bfloat16 PASSED [0.0194s] [ 32%] 
2025-12-04T13:28:26.4771281Z test_ops.py::TestCommonCUDA::test_python_ref__refs_reshape_cuda_int8 PASSED [0.0157s] [ 32%] 2025-12-04T13:28:26.4771381Z test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_float16 PASSED [0.0104s] [ 32%] 2025-12-04T13:28:26.4771481Z test_ops.py::TestCommonCUDA::test_python_ref__refs_roll_cuda_int64 PASSED [0.0090s] [ 32%] 2025-12-04T13:28:26.4771598Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rot90_cuda_complex128 PASSED [0.0155s] [ 32%] 2025-12-04T13:28:26.4771700Z test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float32 PASSED [0.0153s] [ 32%] 2025-12-04T13:28:26.4771803Z test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_float64 PASSED [0.0151s] [ 32%] 2025-12-04T13:28:26.4771937Z test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_int8 PASSED [0.7454s] [ 32%] 2025-12-04T13:28:26.4772036Z test_ops.py::TestCommonCUDA::test_python_ref__refs_round_cuda_uint8 PASSED [0.0121s] [ 32%] 2025-12-04T13:28:26.4772142Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_complex32 PASSED [0.4407s] [ 32%] 2025-12-04T13:28:26.4772242Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rsqrt_cuda_int64 PASSED [0.7396s] [ 32%] 2025-12-04T13:28:26.4772345Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_complex128 PASSED [0.0793s] [ 32%] 2025-12-04T13:28:26.4772447Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_float64 PASSED [0.0617s] [ 33%] 2025-12-04T13:28:26.4772546Z test_ops.py::TestCommonCUDA::test_python_ref__refs_rsub_cuda_uint8 PASSED [0.0471s] [ 33%] 2025-12-04T13:28:26.4772661Z test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_float16 PASSED [0.0071s] [ 33%] 2025-12-04T13:28:26.4772772Z test_ops.py::TestCommonCUDA::test_python_ref__refs_select_scatter_cuda_uint8 PASSED [0.0064s] [ 33%] 2025-12-04T13:28:26.4772875Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_complex32 PASSED [1.0406s] [ 33%] 
2025-12-04T13:28:26.4772975Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sgn_cuda_float16 PASSED [0.0241s] [ 33%] 2025-12-04T13:28:26.4773097Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bfloat16 PASSED [0.0324s] [ 33%] 2025-12-04T13:28:26.4773198Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_bool PASSED [0.0317s] [ 33%] 2025-12-04T13:28:26.4773302Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_float16 PASSED [0.0320s] [ 33%] 2025-12-04T13:28:26.4773405Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int32 PASSED [0.0276s] [ 33%] 2025-12-04T13:28:26.4773507Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sigmoid_cuda_int8 PASSED [0.0259s] [ 33%] 2025-12-04T13:28:26.4773606Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_float32 PASSED [0.7490s] [ 33%] 2025-12-04T13:28:26.4773705Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_int64 PASSED [0.0131s] [ 33%] 2025-12-04T13:28:26.4773802Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sign_cuda_uint8 PASSED [0.0108s] [ 33%] 2025-12-04T13:28:26.4773905Z test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_bool PASSED [0.0131s] [ 33%] 2025-12-04T13:28:26.4774007Z test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int16 PASSED [0.7411s] [ 33%] 2025-12-04T13:28:26.4774123Z test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int32 PASSED [0.0130s] [ 33%] 2025-12-04T13:28:26.4774224Z test_ops.py::TestCommonCUDA::test_python_ref__refs_signbit_cuda_int8 PASSED [0.0107s] [ 33%] 2025-12-04T13:28:26.4774325Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_complex32 PASSED [0.2439s] [ 33%] 2025-12-04T13:28:26.4774442Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sin_cuda_int8 PASSED [0.7404s] [ 33%] 2025-12-04T13:28:26.4774541Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_float32 PASSED [0.2005s] [ 33%] 2025-12-04T13:28:26.4774640Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_sinc_cuda_int64 PASSED [0.2416s] [ 33%] 2025-12-04T13:28:26.4774743Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_complex32 PASSED [0.2399s] [ 33%] 2025-12-04T13:28:26.4774842Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sinh_cuda_float32 PASSED [0.0153s] [ 33%] 2025-12-04T13:28:26.4774961Z test_ops.py::TestCommonCUDA::test_python_ref__refs_softmax_with_dtype_cuda_int32 PASSED [0.0094s] [ 33%] 2025-12-04T13:28:26.4775080Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_float32 PASSED [1.0925s] [ 33%] 2025-12-04T13:28:26.4775206Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j0_cuda_uint8 PASSED [0.2608s] [ 33%] 2025-12-04T13:28:26.4775325Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_bessel_j1_cuda_float32 PASSED [0.0168s] [ 33%] 2025-12-04T13:28:26.4775436Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_bfloat16 PASSED [0.3489s] [ 33%] 2025-12-04T13:28:26.4775547Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_float64 PASSED [0.3084s] [ 33%] 2025-12-04T13:28:26.4775657Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int16 PASSED [0.2258s] [ 33%] 2025-12-04T13:28:26.4775767Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_int64 PASSED [0.0402s] [ 33%] 2025-12-04T13:28:26.4775876Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_entr_cuda_uint8 PASSED [0.0376s] [ 33%] 2025-12-04T13:28:26.4775986Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_erfcx_cuda_int16 PASSED [0.2546s] [ 33%] 2025-12-04T13:28:26.4776097Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_bfloat16 PASSED [0.3539s] [ 33%] 2025-12-04T13:28:26.4776205Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float32 PASSED [0.7501s] [ 33%] 2025-12-04T13:28:26.4776315Z 
test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_float64 PASSED [0.2958s] [ 33%] 2025-12-04T13:28:26.4776420Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i0e_cuda_int64 PASSED [0.2139s] [ 33%] 2025-12-04T13:28:26.4776537Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_int32 PASSED [0.0175s] [ 33%] 2025-12-04T13:28:26.4776641Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1_cuda_uint8 PASSED [0.0164s] [ 33%] 2025-12-04T13:28:26.4776746Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_bool PASSED [0.9419s] [ 33%] 2025-12-04T13:28:26.4776855Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_float64 PASSED [0.2985s] [ 33%] 2025-12-04T13:28:26.4776961Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int16 PASSED [0.0180s] [ 33%] 2025-12-04T13:28:26.4777066Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_i1e_cuda_int32 PASSED [0.0174s] [ 33%] 2025-12-04T13:28:26.4777182Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_float64 PASSED [0.3629s] [ 33%] 2025-12-04T13:28:26.4777293Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_log_ndtr_cuda_int64 PASSED [0.2679s] [ 33%] 2025-12-04T13:28:26.4777433Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_int16 PASSED [0.0669s] [ 33%] 2025-12-04T13:28:26.4777571Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8 PASSED [0.0635s] [ 33%] 2025-12-04T13:28:26.4777716Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int32 PASSED [0.0662s] [ 33%] 2025-12-04T13:28:26.4777853Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int64 PASSED [0.0662s] [ 33%] 2025-12-04T13:28:26.4777999Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_multigammaln_mvlgamma_p_5_cuda_int8 PASSED [0.0634s] [ 33%] 
2025-12-04T13:28:26.4778110Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_ndtri_cuda_int8 PASSED [0.2177s] [ 33%] 2025-12-04T13:28:26.4778240Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_bfloat16 PASSED [0.7399s] [ 33%] 2025-12-04T13:28:26.4778370Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float16 PASSED [0.7317s] [ 33%] 2025-12-04T13:28:26.4778498Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_float32 PASSED [0.7364s] [ 33%] 2025-12-04T13:28:26.4778625Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_softmax_with_dtype_cuda_int16 PASSED [0.7257s] [ 33%] 2025-12-04T13:28:26.4778762Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_spherical_bessel_j0_cuda_bool PASSED [0.2374s] [ 33%] 2025-12-04T13:28:26.4778876Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int16 PASSED [0.1353s] [ 33%] 2025-12-04T13:28:26.4778987Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_xlog1py_cuda_int64 PASSED [0.1349s] [ 33%] 2025-12-04T13:28:26.4779096Z test_ops.py::TestCommonCUDA::test_python_ref__refs_special_zeta_cuda_int16 PASSED [0.9762s] [ 33%] 2025-12-04T13:28:26.4779210Z test_ops.py::TestCommonCUDA::test_python_ref__refs_split_with_sizes_cuda_float32 PASSED [0.0054s] [ 33%] 2025-12-04T13:28:26.4779316Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sqrt_cuda_complex64 PASSED [0.7570s] [ 33%] 2025-12-04T13:28:26.4779421Z test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_float32 PASSED [0.0216s] [ 33%] 2025-12-04T13:28:26.4779521Z test_ops.py::TestCommonCUDA::test_python_ref__refs_square_cuda_int64 PASSED [0.7368s] [ 33%] 2025-12-04T13:28:26.4779631Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int16 PASSED [0.0067s] [ 33%] 2025-12-04T13:28:26.4779739Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_copy_cuda_int64 PASSED [0.7266s] [ 33%] 
2025-12-04T13:28:26.4779845Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_bfloat16 PASSED [0.0064s] [ 33%] 2025-12-04T13:28:26.4779949Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_cuda_float16 PASSED [0.0052s] [ 34%] 2025-12-04T13:28:26.4780069Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_complex64 PASSED [0.0046s] [ 34%] 2025-12-04T13:28:26.4780197Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float16 PASSED [0.7294s] [ 34%] 2025-12-04T13:28:26.4780314Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_float32 PASSED [0.0064s] [ 34%] 2025-12-04T13:28:26.4780426Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_int16 PASSED [0.0042s] [ 34%] 2025-12-04T13:28:26.4780538Z test_ops.py::TestCommonCUDA::test_python_ref__refs_squeeze_multiple_cuda_uint8 PASSED [0.7426s] [ 34%] 2025-12-04T13:28:26.4780642Z test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_float64 PASSED [0.0089s] [ 34%] 2025-12-04T13:28:26.4780745Z test_ops.py::TestCommonCUDA::test_python_ref__refs_stack_cuda_int32 PASSED [0.0066s] [ 34%] 2025-12-04T13:28:26.4780845Z test_ops.py::TestCommonCUDA::test_python_ref__refs_stft_cuda_float32 XFAIL [0.0028s] [ 34%] 2025-12-04T13:28:26.4780945Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_bfloat16 PASSED [0.8150s] [ 34%] 2025-12-04T13:28:26.4781046Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_float64 PASSED [0.0625s] [ 34%] 2025-12-04T13:28:26.4781158Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int32 PASSED [0.7713s] [ 34%] 2025-12-04T13:28:26.4781254Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sub_cuda_int8 PASSED [0.0507s] [ 34%] 2025-12-04T13:28:26.4781354Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float16 PASSED [0.0131s] [ 34%] 2025-12-04T13:28:26.4781463Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_float64 PASSED [0.7388s] 
[ 34%] 2025-12-04T13:28:26.4781561Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int16 PASSED [0.0118s] [ 34%] 2025-12-04T13:28:26.4781658Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_cuda_int32 PASSED [0.0100s] [ 34%] 2025-12-04T13:28:26.4781768Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_complex64 PASSED [0.0096s] [ 34%] 2025-12-04T13:28:26.4781915Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_float16 PASSED [0.0115s] [ 34%] 2025-12-04T13:28:26.4782021Z test_ops.py::TestCommonCUDA::test_python_ref__refs_sum_to_size_cuda_int32 PASSED [0.0089s] [ 34%] 2025-12-04T13:28:26.4782127Z test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_bfloat16 PASSED [0.7300s] [ 34%] 2025-12-04T13:28:26.4782264Z test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_complex64 PASSED [0.0052s] [ 34%] 2025-12-04T13:28:26.4782366Z test_ops.py::TestCommonCUDA::test_python_ref__refs_t_copy_cuda_int64 PASSED [0.0034s] [ 34%] 2025-12-04T13:28:26.4782461Z test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_int32 PASSED [0.7214s] [ 34%] 2025-12-04T13:28:26.4782558Z test_ops.py::TestCommonCUDA::test_python_ref__refs_t_cuda_uint8 PASSED [0.0044s] [ 34%] 2025-12-04T13:28:26.4782672Z test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_complex64 XFAIL [0.0036s] [ 34%] 2025-12-04T13:28:26.4782783Z test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int16 XFAIL [0.7171s] [ 34%] 2025-12-04T13:28:26.4782890Z test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_int64 XFAIL [0.7190s] [ 34%] 2025-12-04T13:28:26.4782999Z test_ops.py::TestCommonCUDA::test_python_ref__refs_take_along_dim_cuda_uint8 XFAIL [0.7206s] [ 34%] 2025-12-04T13:28:26.4783098Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_float32 PASSED [0.7252s] [ 34%] 2025-12-04T13:28:26.4783197Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int32 PASSED [0.0184s] [ 34%] 
2025-12-04T13:28:26.4783294Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tan_cuda_int64 PASSED [0.0180s] [ 34%] 2025-12-04T13:28:26.4783394Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_float32 PASSED [0.7390s] [ 34%] 2025-12-04T13:28:26.4783493Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int64 PASSED [0.0201s] [ 34%] 2025-12-04T13:28:26.4783589Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tanh_cuda_int8 PASSED [0.0169s] [ 34%] 2025-12-04T13:28:26.4783733Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tensor_split_cuda_float32 XFAIL [0.0026s] [ 34%] 2025-12-04T13:28:26.4783833Z test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_float64 PASSED [0.7352s] [ 34%] 2025-12-04T13:28:26.4783928Z test_ops.py::TestCommonCUDA::test_python_ref__refs_to_cuda_int8 PASSED [0.0139s] [ 34%] 2025-12-04T13:28:26.4784041Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float16 PASSED [0.0062s] [ 34%] 2025-12-04T13:28:26.4784157Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float32 PASSED [0.0058s] [ 34%] 2025-12-04T13:28:26.4784273Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_copy_cuda_float64 PASSED [0.0057s] [ 34%] 2025-12-04T13:28:26.4784378Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_bool PASSED [0.7272s] [ 34%] 2025-12-04T13:28:26.4784491Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_complex128 PASSED [0.0067s] [ 34%] 2025-12-04T13:28:26.4784599Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float32 PASSED [0.0054s] [ 34%] 2025-12-04T13:28:26.4784724Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_float64 PASSED [0.0052s] [ 34%] 2025-12-04T13:28:26.4784829Z test_ops.py::TestCommonCUDA::test_python_ref__refs_transpose_cuda_int32 PASSED [0.7243s] [ 34%] 2025-12-04T13:28:26.4784933Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bfloat16 PASSED [0.0130s] [ 34%] 
2025-12-04T13:28:26.4785046Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_bool PASSED [0.0103s] [ 34%] 2025-12-04T13:28:26.4785148Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_float32 PASSED [0.0109s] [ 34%] 2025-12-04T13:28:26.4785247Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_int64 PASSED [0.0099s] [ 34%] 2025-12-04T13:28:26.4785348Z test_ops.py::TestCommonCUDA::test_python_ref__refs_tril_cuda_uint8 PASSED [0.0097s] [ 34%] 2025-12-04T13:28:26.4785454Z test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_bfloat16 PASSED [0.0107s] [ 34%] 2025-12-04T13:28:26.4785557Z test_ops.py::TestCommonCUDA::test_python_ref__refs_triu_cuda_complex64 PASSED [0.7320s] [ 34%] 2025-12-04T13:28:26.4785665Z test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_bool PASSED [0.0848s] [ 34%] 2025-12-04T13:28:26.4785788Z test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex128 PASSED [0.0823s] [ 34%] 2025-12-04T13:28:26.4785900Z test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex32 XFAIL [0.0251s] [ 34%] 2025-12-04T13:28:26.4786011Z test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_complex64 PASSED [0.8250s] [ 34%] 2025-12-04T13:28:26.4786118Z test_ops.py::TestCommonCUDA::test_python_ref__refs_true_divide_cuda_uint8 PASSED [0.0834s] [ 34%] 2025-12-04T13:28:26.4786227Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_complex64 PASSED [0.0108s] [ 34%] 2025-12-04T13:28:26.4786336Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_copy_cuda_int8 PASSED [0.7364s] [ 34%] 2025-12-04T13:28:26.4786441Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unbind_cuda_bfloat16 PASSED [0.0112s] [ 34%] 2025-12-04T13:28:26.4786548Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_copy_cuda_int8 PASSED [0.0097s] [ 34%] 2025-12-04T13:28:26.4786654Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_complex128 PASSED 
[0.0112s] [ 34%] 2025-12-04T13:28:26.4786755Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unfold_cuda_int32 PASSED [0.0081s] [ 34%] 2025-12-04T13:28:26.4786859Z test_ops.py::TestCommonCUDA::test_python_ref__refs_unsqueeze_cuda_int32 PASSED [0.7299s] [ 34%] 2025-12-04T13:28:26.4786963Z test_ops.py::TestCommonCUDA::test_python_ref__refs_var_cuda_complex128 PASSED [0.0104s] [ 34%] 2025-12-04T13:28:26.4787072Z test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_complex128 PASSED [0.0135s] [ 34%] 2025-12-04T13:28:26.4787191Z test_ops.py::TestCommonCUDA::test_python_ref__refs_var_mean_cuda_float64 PASSED [0.0120s] [ 35%] 2025-12-04T13:28:26.4787295Z test_ops.py::TestCommonCUDA::test_python_ref__refs_vdot_cuda_float16 PASSED [0.7411s] [ 35%] 2025-12-04T13:28:26.4787405Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_complex128 PASSED [0.0180s] [ 35%] 2025-12-04T13:28:26.4787510Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float16 PASSED [0.0158s] [ 35%] 2025-12-04T13:28:26.4787612Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_as_cuda_float64 PASSED [0.0156s] [ 35%] 2025-12-04T13:28:26.4787716Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_bool PASSED [0.0049s] [ 35%] 2025-12-04T13:28:26.4787826Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_complex128 PASSED [0.7364s] [ 35%] 2025-12-04T13:28:26.4787930Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_copy_cuda_int32 PASSED [0.0062s] [ 35%] 2025-12-04T13:28:26.4788029Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_float32 PASSED [0.0203s] [ 35%] 2025-12-04T13:28:26.4788128Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int16 PASSED [0.7349s] [ 35%] 2025-12-04T13:28:26.4788236Z test_ops.py::TestCommonCUDA::test_python_ref__refs_view_cuda_int8 PASSED [0.0179s] [ 35%] 2025-12-04T13:28:26.4788340Z test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float16 PASSED [0.7298s] 
[ 35%] 2025-12-04T13:28:26.4788442Z test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_float64 PASSED [0.0054s] [ 35%] 2025-12-04T13:28:26.4788557Z test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_int64 PASSED [0.0036s] [ 35%] 2025-12-04T13:28:26.4788659Z test_ops.py::TestCommonCUDA::test_python_ref__refs_vsplit_cuda_uint8 PASSED [0.7156s] [ 35%] 2025-12-04T13:28:26.4788757Z test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_bool PASSED [0.1334s] [ 35%] 2025-12-04T13:28:26.4788857Z test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_int32 PASSED [0.1354s] [ 35%] 2025-12-04T13:28:26.4788957Z test_ops.py::TestCommonCUDA::test_python_ref__refs_xlogy_cuda_uint8 PASSED [0.1329s] [ 35%] 2025-12-04T13:28:26.4789068Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_and_cuda PASSED [0.0024s] [ 35%] 2025-12-04T13:28:26.4789190Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_bitwise_right_shift_cuda PASSED [0.0022s] [ 35%] 2025-12-04T13:28:26.4789308Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_copysign_cuda PASSED [0.7256s] [ 35%] 2025-12-04T13:28:26.4789409Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_diag_cuda PASSED [0.7207s] [ 35%] 2025-12-04T13:28:26.4789511Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_dstack_cuda XFAIL [0.0060s] [ 35%] 2025-12-04T13:28:26.4789615Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_hfft2_cuda PASSED [1.4396s] [ 35%] 2025-12-04T13:28:26.4789719Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifft2_cuda PASSED [0.7187s] [ 35%] 2025-12-04T13:28:26.4789824Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_ifftn_cuda PASSED [0.7247s] [ 35%] 2025-12-04T13:28:26.4789931Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft2_cuda PASSED [0.7266s] [ 35%] 2025-12-04T13:28:26.4790037Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fft_irfft_cuda PASSED [0.7133s] [ 35%] 
2025-12-04T13:28:26.4790141Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_flipud_cuda PASSED [0.7239s] [ 35%] 2025-12-04T13:28:26.4790242Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_fmax_cuda PASSED [0.7203s] [ 35%] 2025-12-04T13:28:26.4790344Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_hypot_cuda PASSED [0.7420s] [ 35%] 2025-12-04T13:28:26.4790448Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_isclose_cuda PASSED [0.0064s] [ 35%] 2025-12-04T13:28:26.4790548Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_lcm_cuda PASSED [0.0039s] [ 35%] 2025-12-04T13:28:26.4790657Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_le_cuda PASSED [0.7181s] [ 35%] 2025-12-04T13:28:26.4790943Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linalg_diagonal_cuda E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4791082Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4791348Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4791473Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4791685Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4791819Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4792036Z E1204 12:56:45.987000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] RuntimeError: diagonal dimensions 
cannot be identical 1, 1 2025-12-04T13:28:26.4792206Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4792353Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4792605Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4792727Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4792937Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4793071Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4793282Z E1204 12:56:45.989000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 10000) 2025-12-04T13:28:26.4793454Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4793583Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4793834Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4793954Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4794165Z E1204 
12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4794299Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4794497Z E1204 12:56:45.991000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 10000) 2025-12-04T13:28:26.4794666Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4794810Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4795057Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4795179Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4795390Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4795524Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4795687Z E1204 12:56:45.992000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] RuntimeError: diagonal dimensions cannot be identical 1, 1 2025-12-04T13:28:26.4795854Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4796007Z E1204 12:56:45.993000 995238 
site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4796258Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4796402Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4796614Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4796743Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4796941Z E1204 12:56:45.993000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 10000) 2025-12-04T13:28:26.4797109Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] failed while attempting to run meta for aten.diagonal.default 2025-12-04T13:28:26.4797251Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] Traceback (most recent call last): 2025-12-04T13:28:26.4797500Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 2823, in _dispatch_impl 2025-12-04T13:28:26.4797620Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] r = func(*args, **kwargs) 2025-12-04T13:28:26.4797830Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 836, in __call__ 2025-12-04T13:28:26.4797959Z E1204 12:56:45.994000 995238 
site-packages/torch/_subclasses/fake_tensor.py:2827] return self._op(*args, **kwargs) 2025-12-04T13:28:26.4798155Z E1204 12:56:45.994000 995238 site-packages/torch/_subclasses/fake_tensor.py:2827] IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 10000) 2025-12-04T13:28:26.4798197Z PASSED [0.7340s] [ 35%] 2025-12-04T13:28:26.4798312Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_linspace_cuda PASSED [0.7223s] [ 35%] 2025-12-04T13:28:26.4798421Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_log_normal_cuda PASSED [0.7209s] [ 35%] 2025-12-04T13:28:26.4798528Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_movedim_cuda PASSED [0.7265s] [ 35%] 2025-12-04T13:28:26.4798627Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_mul_cuda PASSED [0.0023s] [ 35%] 2025-12-04T13:28:26.4798747Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_narrow_cuda PASSED [0.7276s] [ 35%] 2025-12-04T13:28:26.4798847Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_neg_cuda PASSED [0.7284s] [ 35%] 2025-12-04T13:28:26.4798974Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_nn_functional_softshrink_cuda PASSED [0.7200s] [ 35%] 2025-12-04T13:28:26.4799083Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_sum_to_size_cuda PASSED [0.7367s] [ 35%] 2025-12-04T13:28:26.4799187Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_unbind_cuda PASSED [0.7333s] [ 35%] 2025-12-04T13:28:26.4799294Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_copy_cuda PASSED [0.7353s] [ 35%] 2025-12-04T13:28:26.4799393Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_view_cuda PASSED [0.7312s] [ 35%] 2025-12-04T13:28:26.4799494Z test_ops.py::TestCommonCUDA::test_python_ref_errors__refs_xlogy_cuda PASSED [0.0030s] [ 35%] 2025-12-04T13:28:26.4799621Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_bfloat16 PASSED [0.0126s] [ 35%] 
2025-12-04T13:28:26.4799747Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_T_executor_aten_cuda_uint8 PASSED [0.0068s] [ 35%] 2025-12-04T13:28:26.4799913Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_bool PASSED [0.1061s] [ 35%] 2025-12-04T13:28:26.4800075Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_complex128 PASSED [0.1109s] [ 35%] 2025-12-04T13:28:26.4800238Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bfloat16_executor_aten_cuda_uint8 PASSED [0.0829s] [ 35%] 2025-12-04T13:28:26.4800390Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_bool_executor_aten_cuda_complex32 PASSED [0.1083s] [ 35%] 2025-12-04T13:28:26.4800535Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_bool PASSED [0.0904s] [ 35%] 2025-12-04T13:28:26.4800691Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_complex128 PASSED [0.0971s] [ 35%] 2025-12-04T13:28:26.4800840Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_float16 PASSED [0.0933s] [ 35%] 2025-12-04T13:28:26.4800987Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int64 PASSED [0.0845s] [ 35%] 2025-12-04T13:28:26.4801146Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_byte_executor_aten_cuda_int8 PASSED [0.0803s] [ 35%] 2025-12-04T13:28:26.4801301Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_complex128 PASSED [0.0776s] [ 35%] 2025-12-04T13:28:26.4801454Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float16 PASSED [0.0896s] [ 35%] 2025-12-04T13:28:26.4801606Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_float64 PASSED [0.0980s] [ 35%] 2025-12-04T13:28:26.4801756Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_int64 PASSED [0.0797s] [ 35%] 2025-12-04T13:28:26.4802070Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cdouble_executor_aten_cuda_uint8 PASSED [0.0758s] [ 35%] 2025-12-04T13:28:26.4802225Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_complex64 PASSED [0.0875s] [ 35%] 2025-12-04T13:28:26.4802378Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float16 PASSED [0.0892s] [ 35%] 2025-12-04T13:28:26.4802530Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_float32 PASSED [0.0977s] [ 35%] 2025-12-04T13:28:26.4802679Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int16 PASSED [0.0885s] [ 35%] 2025-12-04T13:28:26.4802844Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_int32 PASSED [0.0882s] [ 36%] 2025-12-04T13:28:26.4802992Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_cfloat_executor_aten_cuda_uint8 PASSED [0.0764s] [ 36%] 2025-12-04T13:28:26.4803143Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_complex128 PASSED [0.1027s] [ 36%] 2025-12-04T13:28:26.4803296Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_int16 PASSED [0.0803s] [ 36%] 2025-12-04T13:28:26.4803442Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_chalf_executor_aten_cuda_uint8 PASSED [0.0759s] [ 36%] 2025-12-04T13:28:26.4803595Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex128 PASSED [0.0995s] [ 36%] 2025-12-04T13:28:26.4803745Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_complex32 PASSED [0.0977s] [ 36%] 2025-12-04T13:28:26.4803895Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float32 PASSED [0.0840s] [ 36%] 2025-12-04T13:28:26.4804055Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_float64 PASSED [0.0836s] [ 36%] 2025-12-04T13:28:26.4804202Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int16 PASSED [0.0756s] [ 36%] 2025-12-04T13:28:26.4804359Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_char_executor_aten_cuda_int32 PASSED [0.0757s] [ 36%] 2025-12-04T13:28:26.4804512Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_bfloat16 PASSED [0.1027s] [ 36%] 2025-12-04T13:28:26.4804663Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_complex32 PASSED [0.1017s] [ 36%] 2025-12-04T13:28:26.4804813Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_float16 PASSED [0.0872s] [ 36%] 2025-12-04T13:28:26.4804960Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int16 PASSED [0.8839s] [ 36%] 2025-12-04T13:28:26.4805118Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_double_executor_aten_cuda_int8 PASSED [0.0843s] [ 36%] 2025-12-04T13:28:26.4805271Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_complex128 PASSED [0.1036s] [ 36%] 2025-12-04T13:28:26.4805420Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_float_executor_aten_cuda_float32 PASSED [0.0625s] [ 36%] 2025-12-04T13:28:26.4805567Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_float32 PASSED [0.0852s] [ 36%] 2025-12-04T13:28:26.4805711Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int32 PASSED [0.0535s] [ 36%] 2025-12-04T13:28:26.4805854Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_int8 PASSED [0.0718s] [ 36%] 2025-12-04T13:28:26.4805997Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_int_executor_aten_cuda_uint8 PASSED [0.0720s] [ 36%] 2025-12-04T13:28:26.4806150Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_complex32 PASSED [0.1026s] [ 36%] 2025-12-04T13:28:26.4806298Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs__conversions_long_executor_aten_cuda_float16 PASSED [0.0925s] [ 36%] 2025-12-04T13:28:26.4806423Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int16 PASSED [0.0598s] [ 36%] 2025-12-04T13:28:26.4806549Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int32 PASSED [0.0577s] [ 36%] 2025-12-04T13:28:26.4806684Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_abs_executor_aten_cuda_int64 PASSED [0.0577s] [ 36%] 2025-12-04T13:28:26.4806818Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_bfloat16 PASSED [0.1155s] [ 36%] 2025-12-04T13:28:26.4806951Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_complex32 PASSED [0.1299s] [ 36%] 2025-12-04T13:28:26.4807080Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acos_executor_aten_cuda_int16 PASSED [0.0936s] [ 36%] 2025-12-04T13:28:26.4807208Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_bool PASSED [0.1048s] [ 36%] 2025-12-04T13:28:26.4807341Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_float64 PASSED [0.0761s] [ 36%] 2025-12-04T13:28:26.4807470Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_acosh_executor_aten_cuda_int64 PASSED [0.0864s] [ 36%] 2025-12-04T13:28:26.4807601Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_add_executor_aten_cuda_uint8 PASSED [0.3072s] [ 36%] 2025-12-04T13:28:26.4807755Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addcdiv_executor_aten_cuda_float16 PASSED [0.6021s] [ 36%] 2025-12-04T13:28:26.4807885Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float16 PASSED [0.0440s] [ 36%] 2025-12-04T13:28:26.4808016Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_float32 PASSED [0.0294s] [ 36%] 2025-12-04T13:28:26.4808153Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_addr_executor_aten_cuda_int8 PASSED [1.3605s] [ 36%] 2025-12-04T13:28:26.4808293Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bfloat16 PASSED [0.0109s] [ 36%] 2025-12-04T13:28:26.4808427Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_bool PASSED [0.0069s] [ 36%] 2025-12-04T13:28:26.4808571Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_complex128 PASSED [0.0069s] [ 36%] 2025-12-04T13:28:26.4808709Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_float32 PASSED [0.0066s] [ 36%] 2025-12-04T13:28:26.4808848Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_alias_copy_executor_aten_cuda_int64 PASSED [0.0069s] [ 36%] 2025-12-04T13:28:26.4808981Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_bool 
PASSED [0.0816s] [ 36%] 2025-12-04T13:28:26.4809108Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int64 PASSED [0.0789s] [ 36%] 2025-12-04T13:28:26.4809232Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_all_executor_aten_cuda_int8 PASSED [0.0801s] [ 36%] 2025-12-04T13:28:26.4809356Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_bool PASSED [0.0414s] [ 36%] 2025-12-04T13:28:26.4809488Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amax_executor_aten_cuda_float32 PASSED [0.0444s] [ 36%] 2025-12-04T13:28:26.4809619Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_float16 PASSED [0.0641s] [ 36%] 2025-12-04T13:28:26.4809748Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int16 PASSED [0.0429s] [ 36%] 2025-12-04T13:28:26.4809875Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int64 PASSED [1.3117s] [ 36%] 2025-12-04T13:28:26.4810000Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_amin_executor_aten_cuda_int8 PASSED [0.0471s] [ 36%] 2025-12-04T13:28:26.4810126Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_float16 PASSED [0.0802s] [ 36%] 2025-12-04T13:28:26.4810251Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_any_executor_aten_cuda_int16 PASSED [0.0708s] [ 36%] 2025-12-04T13:28:26.4810396Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_bfloat16 PASSED [0.0915s] [ 36%] 2025-12-04T13:28:26.4810527Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int64 PASSED [0.0405s] [ 36%] 2025-12-04T13:28:26.4810656Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_arange_executor_aten_cuda_int8 PASSED [0.0390s] [ 36%] 2025-12-04T13:28:26.4810801Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_bool PASSED [0.0146s] [ 36%] 2025-12-04T13:28:26.4810945Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_float64 PASSED [1.2889s] [ 36%] 2025-12-04T13:28:26.4811088Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int16 PASSED [0.0186s] [ 36%] 2025-12-04T13:28:26.4811227Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_copy_executor_aten_cuda_int8 PASSED [0.0154s] [ 36%] 2025-12-04T13:28:26.4811363Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_executor_aten_cuda_float64 PASSED [0.0124s] [ 36%] 2025-12-04T13:28:26.4811539Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_bfloat16 PASSED [0.0085s] [ 36%] 2025-12-04T13:28:26.4811697Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_complex128 PASSED [0.0086s] [ 36%] 2025-12-04T13:28:26.4811906Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_partial_views_executor_aten_cuda_uint8 PASSED [0.0085s] [ 36%] 2025-12-04T13:28:26.4812054Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_as_strided_scatter_executor_aten_cuda_float64 PASSED [1.2925s] [ 36%] 2025-12-04T13:28:26.4812187Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_complex32 PASSED [0.1258s] [ 36%] 2025-12-04T13:28:26.4812316Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_float32 PASSED [0.0676s] [ 37%] 2025-12-04T13:28:26.4812444Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asin_executor_aten_cuda_int16 PASSED [0.0765s] [ 37%] 2025-12-04T13:28:26.4812578Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex32 PASSED [0.1171s] [ 37%] 
2025-12-04T13:28:26.4812726Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_complex64 PASSED [0.0816s] [ 37%] 2025-12-04T13:28:26.4812857Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_asinh_executor_aten_cuda_float32 PASSED [0.0669s] [ 37%] 2025-12-04T13:28:26.4812986Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atan2_executor_aten_cuda_int32 PASSED [0.3799s] [ 37%] 2025-12-04T13:28:26.4813120Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_complex64 PASSED [0.0816s] [ 37%] 2025-12-04T13:28:26.4813250Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_float64 PASSED [0.0762s] [ 37%] 2025-12-04T13:28:26.4813376Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atanh_executor_aten_cuda_int8 PASSED [0.0728s] [ 37%] 2025-12-04T13:28:26.4813509Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_bool PASSED [0.0134s] [ 37%] 2025-12-04T13:28:26.4813650Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_complex64 PASSED [0.0142s] [ 37%] 2025-12-04T13:28:26.4813787Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_float16 PASSED [0.0137s] [ 37%] 2025-12-04T13:28:26.4813922Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_1d_executor_aten_cuda_int32 PASSED [0.0126s] [ 37%] 2025-12-04T13:28:26.4814064Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_complex128 PASSED [1.3040s] [ 37%] 2025-12-04T13:28:26.4814215Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_float16 PASSED [0.0195s] [ 37%] 2025-12-04T13:28:26.4814349Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int16 PASSED [0.0160s] [ 37%] 2025-12-04T13:28:26.4814483Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_2d_executor_aten_cuda_int8 PASSED [0.0148s] [ 37%] 2025-12-04T13:28:26.4814622Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bfloat16 PASSED [0.0195s] [ 37%] 2025-12-04T13:28:26.4814755Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_bool PASSED [1.2774s] [ 37%] 2025-12-04T13:28:26.4814896Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_complex128 PASSED [0.0245s] [ 37%] 2025-12-04T13:28:26.4815032Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_float32 PASSED [0.0194s] [ 37%] 2025-12-04T13:28:26.4815168Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_atleast_3d_executor_aten_cuda_int8 PASSED [0.0173s] [ 37%] 2025-12-04T13:28:26.4815303Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_and_executor_aten_cuda_int16 PASSED [0.2883s] [ 37%] 2025-12-04T13:28:26.4815464Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_int16 PASSED [0.2885s] [ 37%] 2025-12-04T13:28:26.4815610Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_left_shift_executor_aten_cuda_uint8 PASSED [0.2784s] [ 37%] 2025-12-04T13:28:26.4815757Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_not_executor_aten_cuda_bool PASSED [0.0777s] [ 37%] 2025-12-04T13:28:26.4815892Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_or_executor_aten_cuda_int16 PASSED [0.2738s] [ 37%] 2025-12-04T13:28:26.4816041Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_right_shift_executor_aten_cuda_int16 PASSED [0.2740s] [ 37%] 2025-12-04T13:28:26.4816177Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bitwise_xor_executor_aten_cuda_int32 PASSED [0.2800s] [ 37%] 2025-12-04T13:28:26.4816319Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_block_diag_executor_aten_cuda_complex64 PASSED [0.1007s] [ 37%] 2025-12-04T13:28:26.4816467Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_shapes_executor_aten_cuda_float32 PASSED [0.0113s] [ 37%] 2025-12-04T13:28:26.4816627Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_tensors_executor_aten_cuda_bfloat16 PASSED [0.0314s] [ 37%] 2025-12-04T13:28:26.4816764Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_broadcast_to_executor_aten_cuda_bool PASSED [1.3304s] [ 37%] 2025-12-04T13:28:26.4816900Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_float16 XFAIL [0.0145s] [ 37%] 2025-12-04T13:28:26.4817036Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int32 XFAIL [1.2691s] [ 37%] 2025-12-04T13:28:26.4817167Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_bucketize_executor_aten_cuda_int8 XFAIL [1.2785s] [ 37%] 2025-12-04T13:28:26.4817300Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_complex128 PASSED [1.3222s] [ 37%] 2025-12-04T13:28:26.4817430Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cat_executor_aten_cuda_float64 PASSED [0.0377s] [ 37%] 2025-12-04T13:28:26.4817562Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_bfloat16 PASSED [0.1036s] [ 37%] 2025-12-04T13:28:26.4817691Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_float16 PASSED [0.1050s] [ 37%] 2025-12-04T13:28:26.4817817Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ceil_executor_aten_cuda_int8 PASSED [0.0532s] [ 37%] 2025-12-04T13:28:26.4817947Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int32 PASSED [0.0677s] [ 37%] 2025-12-04T13:28:26.4818084Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_chunk_executor_aten_cuda_int8 PASSED [0.0684s] [ 37%] 2025-12-04T13:28:26.4818221Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_float16 PASSED [0.6389s] [ 37%] 2025-12-04T13:28:26.4818355Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_int32 PASSED [0.4485s] [ 37%] 2025-12-04T13:28:26.4818488Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_max_executor_aten_cuda_uint8 PASSED [0.4445s] [ 37%] 2025-12-04T13:28:26.4818624Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_bfloat16 PASSED [0.6441s] [ 37%] 2025-12-04T13:28:26.4818757Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int64 PASSED [0.4557s] [ 37%] 2025-12-04T13:28:26.4818887Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clamp_min_executor_aten_cuda_int8 PASSED [0.4481s] [ 37%] 2025-12-04T13:28:26.4819019Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_bfloat16 PASSED [0.1334s] [ 37%] 2025-12-04T13:28:26.4819162Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_complex64 PASSED [0.1358s] [ 37%] 2025-12-04T13:28:26.4819291Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_int8 PASSED [0.1267s] [ 37%] 2025-12-04T13:28:26.4819429Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_clone_executor_aten_cuda_uint8 PASSED [0.1268s] [ 37%] 2025-12-04T13:28:26.4819565Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_column_stack_executor_aten_cuda_bool PASSED [1.4381s] [ 37%] 2025-12-04T13:28:26.4819697Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_complex128 PASSED [0.0931s] [ 37%] 2025-12-04T13:28:26.4819826Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_executor_aten_cuda_float32 PASSED [0.0547s] [ 37%] 2025-12-04T13:28:26.4819970Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_bfloat16 PASSED [0.0487s] [ 37%] 2025-12-04T13:28:26.4820109Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_conj_physical_executor_aten_cuda_int16 PASSED [0.0397s] [ 37%] 2025-12-04T13:28:26.4820265Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_float32 PASSED [0.2502s] [ 37%] 2025-12-04T13:28:26.4820408Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_constant_pad_nd_executor_aten_cuda_uint8 PASSED [0.2417s] [ 37%] 2025-12-04T13:28:26.4820543Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_contiguous_executor_aten_cuda_bool PASSED [0.1054s] [ 37%] 2025-12-04T13:28:26.4820678Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_copysign_executor_aten_cuda_float32 PASSED [0.5261s] [ 37%] 2025-12-04T13:28:26.4820811Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex128 PASSED [0.0845s] [ 37%] 2025-12-04T13:28:26.4820940Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_complex64 PASSED [0.0918s] [ 37%] 2025-12-04T13:28:26.4821067Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int64 PASSED [0.0857s] [ 37%] 2025-12-04T13:28:26.4821190Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_int8 PASSED [0.0802s] [ 37%] 2025-12-04T13:28:26.4821316Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cos_executor_aten_cuda_uint8 PASSED [0.0801s] [ 37%] 2025-12-04T13:28:26.4821445Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_float16 PASSED [0.1123s] [ 38%] 2025-12-04T13:28:26.4821569Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cosh_executor_aten_cuda_int8 PASSED [0.0801s] [ 38%] 2025-12-04T13:28:26.4821721Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_count_nonzero_executor_aten_cuda_bfloat16 PASSED [0.0697s] [ 38%] 2025-12-04T13:28:26.4821896Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumprod_executor_aten_cuda_uint8 PASSED [0.0702s] [ 38%] 2025-12-04T13:28:26.4822030Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_bfloat16 PASSED [1.3749s] [ 38%] 2025-12-04T13:28:26.4822166Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_complex128 PASSED [0.0301s] [ 38%] 2025-12-04T13:28:26.4822299Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_float16 PASSED [1.3197s] [ 38%] 2025-12-04T13:28:26.4822428Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int16 PASSED [0.0367s] [ 38%] 2025-12-04T13:28:26.4822558Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_cumsum_executor_aten_cuda_int64 PASSED [0.0271s] [ 38%] 2025-12-04T13:28:26.4822691Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_deg2rad_executor_aten_cuda_float16 PASSED [0.1146s] [ 38%] 2025-12-04T13:28:26.4822829Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_bfloat16 PASSED [0.2273s] [ 38%] 2025-12-04T13:28:26.4822983Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_embed_executor_aten_cuda_complex128 PASSED [0.2285s] [ 38%] 2025-12-04T13:28:26.4823117Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_complex128 PASSED [0.0474s] [ 38%] 2025-12-04T13:28:26.4823256Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int16 PASSED [0.0447s] [ 38%] 2025-12-04T13:28:26.4823381Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diag_executor_aten_cuda_int64 PASSED [0.0450s] [ 38%] 2025-12-04T13:28:26.4823527Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_complex128 PASSED [0.0619s] [ 38%] 2025-12-04T13:28:26.4823669Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_float32 PASSED [0.0623s] [ 38%] 2025-12-04T13:28:26.4823808Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int16 PASSED [0.0597s] [ 38%] 2025-12-04T13:28:26.4823946Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_copy_executor_aten_cuda_int8 PASSED [0.0603s] [ 38%] 2025-12-04T13:28:26.4824110Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_diagonal_scatter_executor_aten_cuda_complex64 PASSED [0.0733s] [ 38%] 2025-12-04T13:28:26.4824242Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_bool PASSED [0.1048s] [ 38%] 2025-12-04T13:28:26.4824376Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float16 PASSED [0.5108s] [ 38%] 2025-12-04T13:28:26.4824508Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_float64 PASSED [0.4040s] [ 38%] 2025-12-04T13:28:26.4824642Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_digamma_executor_aten_cuda_int64 PASSED [0.0857s] [ 38%] 2025-12-04T13:28:26.4824795Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_floor_rounding_executor_aten_cuda_bfloat16 PASSED [1.9042s] [ 38%] 2025-12-04T13:28:26.4824964Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_complex128 SKIPPED [0.0002s] (Skipped!) 
[ 38%] 2025-12-04T13:28:26.4825117Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_div_no_rounding_mode_executor_aten_cuda_float32 PASSED [0.3051s] [ 38%] 2025-12-04T13:28:26.4825247Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dot_executor_aten_cuda_complex64 PASSED [0.0098s] [ 38%] 2025-12-04T13:28:26.4825380Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_bfloat16 PASSED [0.0093s] [ 38%] 2025-12-04T13:28:26.4825535Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex128 PASSED [0.0092s] [ 38%] 2025-12-04T13:28:26.4825670Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_complex64 PASSED [0.0093s] [ 38%] 2025-12-04T13:28:26.4825801Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dsplit_executor_aten_cuda_uint8 PASSED [0.0094s] [ 38%] 2025-12-04T13:28:26.4825934Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_dstack_executor_aten_cuda_bfloat16 PASSED [0.0127s] [ 38%] 2025-12-04T13:28:26.4826104Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_bool SKIPPED [0.0001s] (Can't check result for empty) [ 38%] 2025-12-04T13:28:26.4826275Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int16 SKIPPED [0.0001s] (Can't check result for empty) [ 38%] 2025-12-04T13:28:26.4826443Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_executor_aten_cuda_int64 SKIPPED [0.0001s] (Can't check result for empty) [ 38%] 2025-12-04T13:28:26.4826629Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_float16 SKIPPED [0.0002s] (Can't check result for empty_like) [ 38%] 2025-12-04T13:28:26.4826820Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_like_executor_aten_cuda_int16 SKIPPED [0.0001s] (Can't check result for empty_like) [ 38%] 2025-12-04T13:28:26.4827015Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_bool SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 38%] 2025-12-04T13:28:26.4827220Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int16 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 38%] 2025-12-04T13:28:26.4827412Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_empty_strided_executor_aten_cuda_int8 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 38%] 2025-12-04T13:28:26.4827542Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_bfloat16 PASSED [0.4053s] [ 38%] 2025-12-04T13:28:26.4827666Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float16 PASSED [0.4149s] [ 38%] 2025-12-04T13:28:26.4827793Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_float32 PASSED [0.2896s] [ 38%] 2025-12-04T13:28:26.4827929Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eq_executor_aten_cuda_int16 PASSED [0.2786s] [ 38%] 2025-12-04T13:28:26.4828063Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_bfloat16 PASSED [0.0309s] [ 38%] 2025-12-04T13:28:26.4828197Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float16 PASSED [0.0304s] [ 38%] 2025-12-04T13:28:26.4828326Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_float64 PASSED [0.0269s] [ 38%] 2025-12-04T13:28:26.4828457Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int32 PASSED [0.0266s] [ 38%] 2025-12-04T13:28:26.4828583Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int64 PASSED [0.0266s] [ 38%] 2025-12-04T13:28:26.4828711Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_equal_executor_aten_cuda_int8 PASSED [0.0266s] [ 
38%] 2025-12-04T13:28:26.4828836Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erf_executor_aten_cuda_int32 PASSED [0.0888s] [ 38%] 2025-12-04T13:28:26.4828969Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_float16 PASSED [0.4280s] [ 38%] 2025-12-04T13:28:26.4829096Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int16 PASSED [0.0876s] [ 38%] 2025-12-04T13:28:26.4829221Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_int8 PASSED [0.0819s] [ 38%] 2025-12-04T13:28:26.4829346Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfc_executor_aten_cuda_uint8 PASSED [0.0813s] [ 38%] 2025-12-04T13:28:26.4829485Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_erfinv_executor_aten_cuda_int64 PASSED [0.0787s] [ 38%] 2025-12-04T13:28:26.4829619Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp2_executor_aten_cuda_complex64 PASSED [0.0918s] [ 38%] 2025-12-04T13:28:26.4829748Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_bfloat16 PASSED [1.7608s] [ 38%] 2025-12-04T13:28:26.4829881Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex128 PASSED [0.0890s] [ 38%] 2025-12-04T13:28:26.4830012Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exp_executor_aten_cuda_complex64 PASSED [0.4691s] [ 38%] 2025-12-04T13:28:26.4830153Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_float64 PASSED [1.3218s] [ 38%] 2025-12-04T13:28:26.4830288Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int16 PASSED [0.0136s] [ 38%] 2025-12-04T13:28:26.4830427Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_as_executor_aten_cuda_int32 PASSED [0.0095s] [ 38%] 2025-12-04T13:28:26.4830581Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_float16 PASSED [0.0262s] [ 38%] 2025-12-04T13:28:26.4830721Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_copy_executor_aten_cuda_int8 PASSED [0.0247s] [ 38%] 2025-12-04T13:28:26.4830863Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expand_executor_aten_cuda_int16 PASSED [0.0195s] [ 38%] 2025-12-04T13:28:26.4830995Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int16 PASSED [0.0766s] [ 39%] 2025-12-04T13:28:26.4831124Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_expm1_executor_aten_cuda_int32 PASSED [0.0761s] [ 39%] 2025-12-04T13:28:26.4831268Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_bfloat16 XFAIL [0.0141s] [ 39%] 2025-12-04T13:28:26.4831410Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_exponential_executor_aten_cuda_float16 XFAIL [0.0134s] [ 39%] 2025-12-04T13:28:26.4831541Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_complex64 PASSED [1.7483s] [ 39%] 2025-12-04T13:28:26.4831681Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float16 PASSED [0.4128s] [ 39%] 2025-12-04T13:28:26.4831808Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float64 PASSED [0.4126s] [ 39%] 2025-12-04T13:28:26.4831994Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_eye_executor_aten_cuda_float8_e4m3fnuz PASSED [0.4100s] [ 39%] 2025-12-04T13:28:26.4832132Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft2_executor_aten_cuda_complex128 PASSED [0.3533s] [ 39%] 2025-12-04T13:28:26.4832265Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_bool PASSED [1.3614s] [ 39%] 2025-12-04T13:28:26.4832394Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fft_executor_aten_cuda_int8 PASSED [0.0306s] [ 39%] 2025-12-04T13:28:26.4832536Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_complex128 PASSED [0.0290s] [ 39%] 2025-12-04T13:28:26.4832670Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_float32 PASSED [0.0348s] [ 39%] 2025-12-04T13:28:26.4832803Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftn_executor_aten_cuda_int16 PASSED [0.0335s] [ 39%] 2025-12-04T13:28:26.4832947Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex32 PASSED [0.0253s] [ 39%] 2025-12-04T13:28:26.4833094Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_complex64 PASSED [0.0238s] [ 39%] 2025-12-04T13:28:26.4833253Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_fftshift_executor_aten_cuda_float16 PASSED [0.0242s] [ 39%] 2025-12-04T13:28:26.4833395Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_complex128 PASSED [0.5885s] [ 39%] 2025-12-04T13:28:26.4833534Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_float64 PASSED [0.0281s] [ 39%] 2025-12-04T13:28:26.4833668Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft2_executor_aten_cuda_int64 PASSED [0.0286s] [ 39%] 2025-12-04T13:28:26.4833801Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_bool PASSED [0.0283s] [ 39%] 2025-12-04T13:28:26.4833935Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfft_executor_aten_cuda_float16 PASSED [0.0329s] [ 39%] 2025-12-04T13:28:26.4834075Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_complex32 PASSED [0.6265s] [ 39%] 2025-12-04T13:28:26.4834212Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float16 PASSED [0.0420s] [ 39%] 2025-12-04T13:28:26.4834348Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_float64 PASSED [0.0327s] [ 39%] 2025-12-04T13:28:26.4834495Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int16 PASSED [0.0321s] [ 39%] 2025-12-04T13:28:26.4834627Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_hfftn_executor_aten_cuda_int64 PASSED [0.0321s] [ 39%] 2025-12-04T13:28:26.4834774Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft2_executor_aten_cuda_int8 PASSED [1.3507s] [ 39%] 2025-12-04T13:28:26.4834907Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_float32 PASSED [0.0326s] [ 39%] 2025-12-04T13:28:26.4835038Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_int8 PASSED [0.0327s] [ 39%] 2025-12-04T13:28:26.4835169Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifft_executor_aten_cuda_uint8 PASSED [0.0327s] [ 39%] 2025-12-04T13:28:26.4835303Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_float16 PASSED [0.0408s] [ 39%] 2025-12-04T13:28:26.4835433Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftn_executor_aten_cuda_int8 PASSED [0.0365s] [ 39%] 2025-12-04T13:28:26.4835589Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_bool PASSED [0.0240s] [ 39%] 2025-12-04T13:28:26.4835728Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ifftshift_executor_aten_cuda_int16 PASSED [0.0239s] [ 39%] 2025-12-04T13:28:26.4835880Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft2_executor_aten_cuda_int64 SKIPPED [0.0001s] (Skipped!) 
[ 39%] 2025-12-04T13:28:26.4836010Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfft_executor_aten_cuda_int8 PASSED [0.0325s] [ 39%] 2025-12-04T13:28:26.4836144Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_ihfftn_executor_aten_cuda_int8 PASSED [0.0422s] [ 39%] 2025-12-04T13:28:26.4836284Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_complex64 PASSED [0.0201s] [ 39%] 2025-12-04T13:28:26.4836419Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft2_executor_aten_cuda_int8 PASSED [0.0236s] [ 39%] 2025-12-04T13:28:26.4836553Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfft_executor_aten_cuda_uint8 PASSED [0.0245s] [ 39%] 2025-12-04T13:28:26.4836692Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_complex32 PASSED [0.0386s] [ 39%] 2025-12-04T13:28:26.4836826Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_int64 PASSED [0.0295s] [ 39%] 2025-12-04T13:28:26.4836958Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_irfftn_executor_aten_cuda_uint8 PASSED [0.0294s] [ 39%] 2025-12-04T13:28:26.4837103Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_float64 PASSED [0.1806s] [ 39%] 2025-12-04T13:28:26.4837236Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft2_executor_aten_cuda_uint8 PASSED [0.0288s] [ 39%] 2025-12-04T13:28:26.4837368Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int16 PASSED [0.0261s] [ 39%] 2025-12-04T13:28:26.4837499Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfft_executor_aten_cuda_int64 PASSED [0.0264s] [ 39%] 2025-12-04T13:28:26.4837632Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fft_rfftn_executor_aten_cuda_uint8 PASSED [0.0335s] [ 39%] 2025-12-04T13:28:26.4837764Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex128 PASSED [0.0908s] [ 39%] 2025-12-04T13:28:26.4837897Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fill_executor_aten_cuda_complex32 PASSED [0.0905s] [ 39%] 2025-12-04T13:28:26.4838036Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_complex64 PASSED [0.0950s] [ 39%] 2025-12-04T13:28:26.4838183Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_float16 PASSED [0.0943s] [ 39%] 2025-12-04T13:28:26.4838315Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flatten_executor_aten_cuda_int16 PASSED [0.0902s] [ 39%] 2025-12-04T13:28:26.4838456Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flip_executor_aten_cuda_float32 PASSED [0.0202s] [ 39%] 2025-12-04T13:28:26.4838587Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_bool PASSED [0.0058s] [ 39%] 2025-12-04T13:28:26.4838719Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_float64 PASSED [0.0062s] [ 39%] 2025-12-04T13:28:26.4838851Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_int8 PASSED [0.0062s] [ 39%] 2025-12-04T13:28:26.4838982Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fliplr_executor_aten_cuda_uint8 PASSED [0.0057s] [ 39%] 2025-12-04T13:28:26.4839120Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_complex128 PASSED [0.0062s] [ 39%] 2025-12-04T13:28:26.4839267Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_int32 PASSED [0.0057s] [ 39%] 2025-12-04T13:28:26.4839398Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_flipud_executor_aten_cuda_uint8 PASSED [0.0054s] [ 39%] 2025-12-04T13:28:26.4839534Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_divide_executor_aten_cuda_uint8 PASSED [0.3006s] [ 39%] 2025-12-04T13:28:26.4839668Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_bfloat16 PASSED [0.1033s] [ 39%] 2025-12-04T13:28:26.4839798Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_floor_executor_aten_cuda_int8 PASSED [0.0536s] [ 39%] 2025-12-04T13:28:26.4839926Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float32 PASSED [0.2687s] [ 39%] 2025-12-04T13:28:26.4840057Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmax_executor_aten_cuda_float64 PASSED [0.2823s] [ 39%] 2025-12-04T13:28:26.4840183Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_bool PASSED [0.2515s] [ 40%] 2025-12-04T13:28:26.4840314Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_float16 PASSED [0.4268s] [ 40%] 2025-12-04T13:28:26.4840439Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_fmin_executor_aten_cuda_int8 PASSED [0.2490s] [ 40%] 2025-12-04T13:28:26.4840564Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_gcd_executor_aten_cuda_int16 PASSED [0.2598s] [ 40%] 2025-12-04T13:28:26.4840689Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ge_executor_aten_cuda_float64 PASSED [0.2834s] [ 40%] 2025-12-04T13:28:26.4840890Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int64 SKIPPED [0.0002s] (Expected: geometric is not comparable) [ 40%] 2025-12-04T13:28:26.4841077Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_int8 SKIPPED [0.0001s] (Expected: geometric is not comparable) [ 40%] 2025-12-04T13:28:26.4841269Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_geometric_executor_aten_cuda_uint8 SKIPPED [0.0001s] (Expected: geometric is not comparable) [ 40%] 
2025-12-04T13:28:26.4841405Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_float64 PASSED [0.5796s] [ 40%] 2025-12-04T13:28:26.4841541Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int32 PASSED [2.0423s] [ 40%] 2025-12-04T13:28:26.4841675Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_heaviside_executor_aten_cuda_int64 PASSED [0.4896s] [ 40%] 2025-12-04T13:28:26.4841807Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_bfloat16 PASSED [0.0095s] [ 40%] 2025-12-04T13:28:26.4842010Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_complex32 PASSED [0.0094s] [ 40%] 2025-12-04T13:28:26.4842142Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_float32 PASSED [1.3181s] [ 40%] 2025-12-04T13:28:26.4842290Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hsplit_executor_aten_cuda_int8 PASSED [0.0117s] [ 40%] 2025-12-04T13:28:26.4842421Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float16 PASSED [0.0091s] [ 40%] 2025-12-04T13:28:26.4842553Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_float64 PASSED [0.0085s] [ 40%] 2025-12-04T13:28:26.4842682Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int16 PASSED [0.0086s] [ 40%] 2025-12-04T13:28:26.4842811Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hstack_executor_aten_cuda_int32 PASSED [0.0079s] [ 40%] 2025-12-04T13:28:26.4842942Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_hypot_executor_aten_cuda_float16 PASSED [0.4327s] [ 40%] 2025-12-04T13:28:26.4843068Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_bfloat16 PASSED [0.4653s] [ 40%] 2025-12-04T13:28:26.4843211Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_float16 PASSED [0.4501s] [ 40%] 2025-12-04T13:28:26.4843335Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_i0_executor_aten_cuda_int8 PASSED [0.0760s] [ 40%] 2025-12-04T13:28:26.4843467Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_igamma_executor_aten_cuda_float64 PASSED [0.2819s] [ 40%] 2025-12-04T13:28:26.4843599Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex128 PASSED [0.0932s] [ 40%] 2025-12-04T13:28:26.4843732Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex32 PASSED [0.0922s] [ 40%] 2025-12-04T13:28:26.4843863Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_imag_executor_aten_cuda_complex64 PASSED [0.0925s] [ 40%] 2025-12-04T13:28:26.4844002Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_add_executor_aten_cuda_complex64 PASSED [0.0277s] [ 40%] 2025-12-04T13:28:26.4844143Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_bfloat16 PASSED [0.0124s] [ 40%] 2025-12-04T13:28:26.4844287Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex128 PASSED [0.0111s] [ 40%] 2025-12-04T13:28:26.4844428Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_complex64 PASSED [0.0111s] [ 40%] 2025-12-04T13:28:26.4844563Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_copy_executor_aten_cuda_int8 PASSED [0.0114s] [ 40%] 2025-12-04T13:28:26.4844716Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_complex128 PASSED [0.0266s] [ 40%] 2025-12-04T13:28:26.4844853Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_float16 PASSED [0.0262s] [ 40%] 2025-12-04T13:28:26.4844988Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_int8 PASSED [0.0255s] [ 40%] 2025-12-04T13:28:26.4845124Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_fill_executor_aten_cuda_uint8 PASSED [0.0257s] [ 40%] 2025-12-04T13:28:26.4845268Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_bfloat16 PASSED [0.0104s] [ 40%] 2025-12-04T13:28:26.4845412Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_complex64 PASSED [0.0103s] [ 40%] 2025-12-04T13:28:26.4845554Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float16 PASSED [0.0109s] [ 40%] 2025-12-04T13:28:26.4845694Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_float32 PASSED [0.0103s] [ 40%] 2025-12-04T13:28:26.4845848Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int16 PASSED [0.0100s] [ 40%] 2025-12-04T13:28:26.4845986Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int64 PASSED [0.0107s] [ 40%] 2025-12-04T13:28:26.4846136Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_index_select_executor_aten_cuda_int8 PASSED [0.0099s] [ 40%] 2025-12-04T13:28:26.4846272Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isclose_executor_aten_cuda_complex64 PASSED [0.9052s] [ 40%] 2025-12-04T13:28:26.4846402Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_bool PASSED [0.0947s] [ 40%] 2025-12-04T13:28:26.4846535Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_float32 PASSED [0.0851s] [ 40%] 2025-12-04T13:28:26.4846664Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int16 PASSED [0.0754s] [ 40%] 2025-12-04T13:28:26.4846796Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_int32 PASSED [0.0753s] [ 40%] 2025-12-04T13:28:26.4846935Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isinf_executor_aten_cuda_uint8 PASSED [0.0707s] [ 40%] 2025-12-04T13:28:26.4847068Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bfloat16 PASSED [0.0852s] [ 40%] 2025-12-04T13:28:26.4847195Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isnan_executor_aten_cuda_bool PASSED [0.0713s] [ 40%] 2025-12-04T13:28:26.4847331Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bfloat16 PASSED [0.0887s] [ 40%] 2025-12-04T13:28:26.4847463Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_bool PASSED [0.0929s] [ 40%] 2025-12-04T13:28:26.4847600Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_float16 PASSED [0.0886s] [ 40%] 2025-12-04T13:28:26.4847732Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isneginf_executor_aten_cuda_uint8 PASSED [0.0752s] [ 40%] 2025-12-04T13:28:26.4847869Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bfloat16 PASSED [0.1002s] [ 40%] 2025-12-04T13:28:26.4848001Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_bool PASSED [0.0937s] [ 40%] 2025-12-04T13:28:26.4848138Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_float32 PASSED [0.0688s] [ 40%] 2025-12-04T13:28:26.4848272Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int16 PASSED [0.0751s] [ 40%] 2025-12-04T13:28:26.4848414Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_int64 PASSED [0.0761s] [ 40%] 2025-12-04T13:28:26.4848549Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isposinf_executor_aten_cuda_uint8 PASSED [0.0707s] [ 40%] 2025-12-04T13:28:26.4848677Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_bool PASSED [0.1139s] [ 40%] 2025-12-04T13:28:26.4848815Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_complex128 PASSED [0.1115s] [ 40%] 2025-12-04T13:28:26.4848946Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_isreal_executor_aten_cuda_int16 PASSED [0.0847s] [ 40%] 2025-12-04T13:28:26.4849185Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_istft_executor_aten_cuda_complex128 SKIPPED [0.0002s] (Expected: unfold_backward() got an unexpected keyword argument 'input_sizes') [ 40%] 2025-12-04T13:28:26.4849316Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_complex64 PASSED [0.0088s] [ 40%] 2025-12-04T13:28:26.4849445Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_item_executor_aten_cuda_float16 PASSED [0.0090s] [ 40%] 2025-12-04T13:28:26.4849579Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_le_executor_aten_cuda_int16 PASSED [0.2862s] [ 41%] 2025-12-04T13:28:26.4849711Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lerp_executor_aten_cuda_float64 PASSED [0.1249s] [ 41%] 2025-12-04T13:28:26.4849852Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_float32 PASSED [0.0776s] [ 41%] 2025-12-04T13:28:26.4849984Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_lgamma_executor_aten_cuda_int64 PASSED [0.3118s] [ 41%] 2025-12-04T13:28:26.4850130Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_complex64 PASSED [0.0313s] [ 41%] 2025-12-04T13:28:26.4850271Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_float16 PASSED [1.4739s] [ 41%] 2025-12-04T13:28:26.4850411Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_cross_executor_aten_cuda_int16 PASSED [0.0334s] [ 41%] 2025-12-04T13:28:26.4850557Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_diagonal_executor_aten_cuda_float32 PASSED [0.0316s] [ 41%] 2025-12-04T13:28:26.4850720Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_matrix_norm_executor_aten_cuda_bfloat16 PASSED [0.2508s] [ 41%] 2025-12-04T13:28:26.4850864Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_complex128 PASSED [0.4699s] [ 41%] 2025-12-04T13:28:26.4851004Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_norm_executor_aten_cuda_float16 PASSED [1.7675s] [ 41%] 2025-12-04T13:28:26.4851142Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_complex64 PASSED [0.7670s] [ 41%] 2025-12-04T13:28:26.4851282Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_svd_executor_aten_cuda_float32 PASSED [0.7630s] [ 41%] 2025-12-04T13:28:26.4851424Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vecdot_executor_aten_cuda_float64 PASSED [0.1233s] [ 41%] 2025-12-04T13:28:26.4851578Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linalg_vector_norm_executor_aten_cuda_float16 PASSED [0.7874s] [ 41%] 2025-12-04T13:28:26.4851719Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex128 PASSED [0.2155s] [ 41%] 2025-12-04T13:28:26.4851896Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_complex64 PASSED [0.2223s] [ 41%] 2025-12-04T13:28:26.4852029Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_executor_aten_cuda_int32 XFAIL [0.0058s] [ 41%] 2025-12-04T13:28:26.4852188Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_linspace_tensor_overload_executor_aten_cuda_complex128 PASSED [1.0309s] [ 41%] 
2025-12-04T13:28:26.4852338Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_complex128 PASSED [0.0920s] [ 41%] 2025-12-04T13:28:26.4852468Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_float64 PASSED [0.0765s] [ 41%] 2025-12-04T13:28:26.4852599Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log10_executor_aten_cuda_int64 PASSED [0.0871s] [ 41%] 2025-12-04T13:28:26.4852729Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log1p_executor_aten_cuda_int64 PASSED [0.0782s] [ 41%] 2025-12-04T13:28:26.4852865Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log2_executor_aten_cuda_complex64 PASSED [0.3798s] [ 41%] 2025-12-04T13:28:26.4852988Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_bool PASSED [0.1047s] [ 41%] 2025-12-04T13:28:26.4853113Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_executor_aten_cuda_int32 PASSED [0.0874s] [ 41%] 2025-12-04T13:28:26.4853305Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_normal_executor_aten_cuda_float16 SKIPPED [0.0003s] (Expected: log_normal is not comparable) [ 41%] 2025-12-04T13:28:26.4853472Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_bfloat16 PASSED [0.0658s] [ 41%] 2025-12-04T13:28:26.4853625Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_float32 PASSED [0.0589s] [ 41%] 2025-12-04T13:28:26.4853793Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int16 PASSED [0.0579s] [ 41%] 2025-12-04T13:28:26.4853942Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_int32 PASSED [0.0577s] [ 41%] 2025-12-04T13:28:26.4854090Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_log_softmax_with_dtype_executor_aten_cuda_uint8 PASSED 
[0.0578s] [ 41%] 2025-12-04T13:28:26.4854231Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp2_executor_aten_cuda_float64 PASSED [0.0109s] [ 41%] 2025-12-04T13:28:26.4854369Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_bfloat16 PASSED [0.9423s] [ 41%] 2025-12-04T13:28:26.4854508Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logaddexp_executor_aten_cuda_float32 PASSED [0.8082s] [ 41%] 2025-12-04T13:28:26.4854657Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_bool PASSED [0.2523s] [ 41%] 2025-12-04T13:28:26.4854795Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_int16 PASSED [0.3538s] [ 41%] 2025-12-04T13:28:26.4854934Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_and_executor_aten_cuda_uint8 PASSED [0.3487s] [ 41%] 2025-12-04T13:28:26.4855078Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_complex128 PASSED [0.0911s] [ 41%] 2025-12-04T13:28:26.4855219Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_not_executor_aten_cuda_float32 PASSED [0.0775s] [ 41%] 2025-12-04T13:28:26.4855359Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_or_executor_aten_cuda_complex64 PASSED [0.3824s] [ 41%] 2025-12-04T13:28:26.4855495Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logical_xor_executor_aten_cuda_bool PASSED [0.2538s] [ 41%] 2025-12-04T13:28:26.4855631Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_executor_aten_cuda_float64 PASSED [3.5352s] [ 41%] 2025-12-04T13:28:26.4855795Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_complex128 PASSED [10.5454s] [ 41%] 2025-12-04T13:28:26.4855947Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logspace_tensor_overload_executor_aten_cuda_int8 PASSED [5.3610s] 
[ 41%] 2025-12-04T13:28:26.4856097Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_float32 PASSED [0.1269s] [ 41%] 2025-12-04T13:28:26.4856230Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_logsumexp_executor_aten_cuda_int8 PASSED [0.0592s] [ 41%] 2025-12-04T13:28:26.4856368Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int32 PASSED [0.0283s] [ 41%] 2025-12-04T13:28:26.4856503Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_int64 PASSED [0.0276s] [ 41%] 2025-12-04T13:28:26.4856642Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_masked_fill_executor_aten_cuda_uint8 PASSED [0.0268s] [ 41%] 2025-12-04T13:28:26.4856774Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_maximum_executor_aten_cuda_bool PASSED [0.2439s] [ 41%] 2025-12-04T13:28:26.4856907Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_complex128 PASSED [0.0570s] [ 41%] 2025-12-04T13:28:26.4857039Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mean_executor_aten_cuda_float64 PASSED [0.0561s] [ 41%] 2025-12-04T13:28:26.4857199Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_complex128 PASSED [0.0396s] [ 41%] 2025-12-04T13:28:26.4857368Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float16 PASSED [0.0390s] [ 41%] 2025-12-04T13:28:26.4857522Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_float64 PASSED [0.0445s] [ 41%] 2025-12-04T13:28:26.4857684Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_int64 PASSED [0.0381s] [ 41%] 2025-12-04T13:28:26.4857835Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_list_of_tensors_executor_aten_cuda_uint8 PASSED 
[0.0366s] [ 41%] 2025-12-04T13:28:26.4857996Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_complex64 PASSED [0.0396s] [ 41%] 2025-12-04T13:28:26.4858151Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_float64 PASSED [0.0374s] [ 41%] 2025-12-04T13:28:26.4858306Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int16 PASSED [0.0366s] [ 41%] 2025-12-04T13:28:26.4858471Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_meshgrid_variadic_tensors_executor_aten_cuda_int32 PASSED [0.0369s] [ 41%] 2025-12-04T13:28:26.4858607Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_bfloat16 PASSED [0.4330s] [ 41%] 2025-12-04T13:28:26.4858742Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_float16 PASSED [0.4406s] [ 41%] 2025-12-04T13:28:26.4858872Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_minimum_executor_aten_cuda_int8 PASSED [0.2587s] [ 41%] 2025-12-04T13:28:26.4859008Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_bfloat16 PASSED [0.0259s] [ 41%] 2025-12-04T13:28:26.4859144Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_complex32 PASSED [0.0256s] [ 41%] 2025-12-04T13:28:26.4859279Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_int64 PASSED [1.7229s] [ 42%] 2025-12-04T13:28:26.4859414Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_movedim_executor_aten_cuda_uint8 PASSED [0.0289s] [ 42%] 2025-12-04T13:28:26.4859546Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_mul_executor_aten_cuda_complex128 PASSED [0.3285s] [ 42%] 2025-12-04T13:28:26.4859679Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_float32 PASSED 
[0.1749s] [ 42%] 2025-12-04T13:28:26.4859811Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nan_to_num_executor_aten_cuda_uint8 PASSED [0.0617s] [ 42%] 2025-12-04T13:28:26.4859965Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_bfloat16 PASSED [0.0602s] [ 42%] 2025-12-04T13:28:26.4860105Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float16 PASSED [0.0600s] [ 42%] 2025-12-04T13:28:26.4860244Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_float32 PASSED [0.0602s] [ 42%] 2025-12-04T13:28:26.4860379Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int32 PASSED [0.0592s] [ 42%] 2025-12-04T13:28:26.4860516Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int64 PASSED [0.0589s] [ 42%] 2025-12-04T13:28:26.4860650Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_copy_executor_aten_cuda_int8 PASSED [0.0584s] [ 42%] 2025-12-04T13:28:26.4860781Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_bool PASSED [0.1061s] [ 42%] 2025-12-04T13:28:26.4860917Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex128 PASSED [0.1114s] [ 42%] 2025-12-04T13:28:26.4861064Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_complex32 PASSED [0.1120s] [ 42%] 2025-12-04T13:28:26.4861193Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_narrow_executor_aten_cuda_int16 PASSED [0.1067s] [ 42%] 2025-12-04T13:28:26.4861353Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_native_layer_norm_executor_aten_cuda_float16 PASSED [0.1895s] [ 42%] 2025-12-04T13:28:26.4861483Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_complex64 PASSED [0.2959s] [ 42%] 2025-12-04T13:28:26.4861610Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ne_executor_aten_cuda_float64 PASSED [0.2819s] [ 42%] 2025-12-04T13:28:26.4861736Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_neg_executor_aten_cuda_uint8 PASSED [0.0541s] [ 42%] 2025-12-04T13:28:26.4861968Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex128 SKIPPED [0.0001s] (Can't check result for new_empty) [ 42%] 2025-12-04T13:28:26.4862154Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex32 SKIPPED [0.0001s] (Can't check result for new_empty) [ 42%] 2025-12-04T13:28:26.4862351Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_complex64 SKIPPED [0.0001s] (Can't check result for new_empty) [ 42%] 2025-12-04T13:28:26.4862529Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int32 SKIPPED [0.0001s] (Can't check result for new_empty) [ 42%] 2025-12-04T13:28:26.4862704Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_executor_aten_cuda_int8 SKIPPED [0.0001s] (Can't check result for new_empty) [ 42%] 2025-12-04T13:28:26.4862911Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_empty_strided_executor_aten_cuda_complex64 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 42%] 2025-12-04T13:28:26.4863046Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_float16 PASSED [0.0230s] [ 42%] 2025-12-04T13:28:26.4863177Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_full_executor_aten_cuda_int8 PASSED [0.0212s] [ 42%] 2025-12-04T13:28:26.4863316Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex128 PASSED [0.0225s] [ 42%] 2025-12-04T13:28:26.4863451Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_complex32 PASSED [0.0264s] [ 42%] 
2025-12-04T13:28:26.4863586Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_float64 PASSED [0.0252s] [ 42%] 2025-12-04T13:28:26.4863717Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int64 PASSED [0.0223s] [ 42%] 2025-12-04T13:28:26.4863859Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_ones_executor_aten_cuda_int8 PASSED [0.0212s] [ 42%] 2025-12-04T13:28:26.4863996Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_new_zeros_executor_aten_cuda_complex64 PASSED [0.0214s] [ 42%] 2025-12-04T13:28:26.4864134Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nextafter_executor_aten_cuda_bfloat16 PASSED [0.2783s] [ 42%] 2025-12-04T13:28:26.4864285Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_bfloat16 PASSED [0.1965s] [ 42%] 2025-12-04T13:28:26.4864433Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_celu_executor_aten_cuda_float16 PASSED [0.1882s] [ 42%] 2025-12-04T13:28:26.4864596Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_float64 PASSED [0.0134s] [ 42%] 2025-12-04T13:28:26.4864757Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_int32 PASSED [0.0127s] [ 42%] 2025-12-04T13:28:26.4864917Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_channel_shuffle_executor_aten_cuda_uint8 PASSED [0.0134s] [ 42%] 2025-12-04T13:28:26.4865077Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_gelu_executor_aten_cuda_float64 PASSED [0.0411s] [ 42%] 2025-12-04T13:28:26.4865234Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_group_norm_executor_aten_cuda_float64 PASSED [0.4865s] [ 42%] 2025-12-04T13:28:26.4865401Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_bfloat16 PASSED [0.2097s] [ 42%] 2025-12-04T13:28:26.4865558Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardshrink_executor_aten_cuda_float64 PASSED [0.1371s] [ 42%] 2025-12-04T13:28:26.4865711Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_bfloat16 PASSED [0.2766s] [ 42%] 2025-12-04T13:28:26.4865869Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_float16 PASSED [0.2763s] [ 42%] 2025-12-04T13:28:26.4866022Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_hardtanh_executor_aten_cuda_int32 PASSED [0.2074s] [ 42%] 2025-12-04T13:28:26.4866188Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_bfloat16 PASSED [0.0779s] [ 42%] 2025-12-04T13:28:26.4866341Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_huber_loss_executor_aten_cuda_float32 PASSED [0.0585s] [ 42%] 2025-12-04T13:28:26.4866494Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_complex128 PASSED [0.0314s] [ 42%] 2025-12-04T13:28:26.4866644Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float32 PASSED [0.0332s] [ 42%] 2025-12-04T13:28:26.4866792Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_l1_loss_executor_aten_cuda_float64 PASSED [0.0233s] [ 42%] 2025-12-04T13:28:26.4866947Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_layer_norm_executor_aten_cuda_bfloat16 PASSED [0.0642s] [ 42%] 2025-12-04T13:28:26.4867117Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float16 PASSED [0.0583s] [ 42%] 2025-12-04T13:28:26.4867291Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_float32 PASSED [0.0578s] [ 42%] 2025-12-04T13:28:26.4867456Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_log_softmax_with_dtype_executor_aten_cuda_int8 PASSED [0.0580s] [ 42%] 2025-12-04T13:28:26.4867623Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_float16 PASSED [0.4139s] [ 42%] 2025-12-04T13:28:26.4867796Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_margin_ranking_loss_executor_aten_cuda_int16 PASSED [0.2010s] [ 42%] 2025-12-04T13:28:26.4867961Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pairwise_distance_executor_aten_cuda_int8 PASSED [0.0388s] [ 42%] 2025-12-04T13:28:26.4868122Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_complex64 PASSED [0.0233s] [ 42%] 2025-12-04T13:28:26.4868282Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_shuffle_executor_aten_cuda_float64 PASSED [0.0234s] [ 42%] 2025-12-04T13:28:26.4868445Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_bfloat16 PASSED [0.0229s] [ 42%] 2025-12-04T13:28:26.4868602Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_pixel_unshuffle_executor_aten_cuda_int32 PASSED [0.0224s] [ 42%] 2025-12-04T13:28:26.4868765Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_poisson_nll_loss_executor_aten_cuda_float16 PASSED [0.5847s] [ 42%] 2025-12-04T13:28:26.4868928Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_prelu_executor_aten_cuda_float16 PASSED [0.5739s] [ 42%] 2025-12-04T13:28:26.4869078Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_float32 PASSED [0.2205s] [ 
42%] 2025-12-04T13:28:26.4869237Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int64 PASSED [0.1892s] [ 42%] 2025-12-04T13:28:26.4869384Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu6_executor_aten_cuda_int8 PASSED [0.1803s] [ 42%] 2025-12-04T13:28:26.4869528Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_relu_executor_aten_cuda_int16 PASSED [0.1055s] [ 43%] 2025-12-04T13:28:26.4869676Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_selu_executor_aten_cuda_float64 PASSED [0.1564s] [ 43%] 2025-12-04T13:28:26.4869835Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_smooth_l1_loss_executor_aten_cuda_float64 PASSED [0.0507s] [ 43%] 2025-12-04T13:28:26.4870005Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_complex128 PASSED [0.0440s] [ 43%] 2025-12-04T13:28:26.4870182Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_float32 PASSED [0.0425s] [ 43%] 2025-12-04T13:28:26.4870343Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int16 PASSED [0.0428s] [ 43%] 2025-12-04T13:28:26.4870504Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_int64 PASSED [0.0433s] [ 43%] 2025-12-04T13:28:26.4870665Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmax_with_dtype_executor_aten_cuda_uint8 PASSED [0.0424s] [ 43%] 2025-12-04T13:28:26.4870831Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_complex64 PASSED [1.8757s] [ 43%] 2025-12-04T13:28:26.4870995Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_float64 PASSED 
[0.0424s] [ 43%] 2025-12-04T13:28:26.4871157Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int16 PASSED [0.0458s] [ 43%] 2025-12-04T13:28:26.4871317Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_int64 PASSED [0.0455s] [ 43%] 2025-12-04T13:28:26.4871480Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softmin_with_dtype_executor_aten_cuda_uint8 PASSED [0.0453s] [ 43%] 2025-12-04T13:28:26.4871647Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_softshrink_executor_aten_cuda_float64 PASSED [0.2119s] [ 43%] 2025-12-04T13:28:26.4871800Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_threshold_executor_aten_cuda_float32 PASSED [0.1091s] [ 43%] 2025-12-04T13:28:26.4871996Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_float64 PASSED [0.0788s] [ 43%] 2025-12-04T13:28:26.4872160Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_nn_functional_triplet_margin_loss_executor_aten_cuda_int16 PASSED [0.0941s] [ 43%] 2025-12-04T13:28:26.4872292Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_norm_executor_aten_cuda_float16 PASSED [0.1910s] [ 43%] 2025-12-04T13:28:26.4872483Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float16 SKIPPED [0.0002s] (make_traced() doesn't set seed properly!) [ 43%] 2025-12-04T13:28:26.4872671Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_normal_executor_aten_cuda_float32 SKIPPED [0.0001s] (make_traced() doesn't set seed properly!) 
[ 43%] 2025-12-04T13:28:26.4872799Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ones_executor_aten_cuda_int32 PASSED [0.0062s] [ 43%] 2025-12-04T13:28:26.4872956Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_bfloat16 PASSED [0.1644s] [ 43%] 2025-12-04T13:28:26.4873102Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_complex64 PASSED [0.1714s] [ 43%] 2025-12-04T13:28:26.4873255Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_float64 PASSED [0.1647s] [ 43%] 2025-12-04T13:28:26.4873394Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int16 PASSED [0.1585s] [ 43%] 2025-12-04T13:28:26.4873529Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_copy_executor_aten_cuda_int64 PASSED [0.1576s] [ 43%] 2025-12-04T13:28:26.4873666Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_complex32 PASSED [0.1352s] [ 43%] 2025-12-04T13:28:26.4873801Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_float32 PASSED [0.1328s] [ 43%] 2025-12-04T13:28:26.4873934Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int32 PASSED [0.1260s] [ 43%] 2025-12-04T13:28:26.4874079Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_int64 PASSED [0.1192s] [ 43%] 2025-12-04T13:28:26.4874213Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_permute_executor_aten_cuda_uint8 PASSED [0.1308s] [ 43%] 2025-12-04T13:28:26.4874353Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_complex128 PASSED [0.0657s] [ 43%] 2025-12-04T13:28:26.4874490Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float16 PASSED [1.4906s] [ 43%] 2025-12-04T13:28:26.4874626Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_positive_executor_aten_cuda_float32 PASSED [0.0523s] [ 43%] 2025-12-04T13:28:26.4874759Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex128 PASSED [0.3128s] [ 43%] 2025-12-04T13:28:26.4874888Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_complex64 PASSED [0.3128s] [ 43%] 2025-12-04T13:28:26.4875016Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_int64 PASSED [0.2749s] [ 43%] 2025-12-04T13:28:26.4875142Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_pow_executor_aten_cuda_uint8 PASSED [0.2686s] [ 43%] 2025-12-04T13:28:26.4875272Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_prod_executor_aten_cuda_complex64 PASSED [0.0905s] [ 43%] 2025-12-04T13:28:26.4875403Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_ravel_executor_aten_cuda_float32 PASSED [0.0087s] [ 43%] 2025-12-04T13:28:26.4875542Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_int64 PASSED [0.0459s] [ 43%] 2025-12-04T13:28:26.4875671Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_real_executor_aten_cuda_uint8 PASSED [0.0430s] [ 43%] 2025-12-04T13:28:26.4875808Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_float32 PASSED [0.0753s] [ 43%] 2025-12-04T13:28:26.4875945Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_int16 PASSED [0.0854s] [ 43%] 2025-12-04T13:28:26.4876082Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reciprocal_executor_aten_cuda_uint8 PASSED [0.0800s] [ 43%] 2025-12-04T13:28:26.4876220Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_bfloat16 PASSED [0.4727s] [ 43%] 2025-12-04T13:28:26.4876353Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int32 
PASSED [0.2889s] [ 43%] 2025-12-04T13:28:26.4876489Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_int64 PASSED [0.2844s] [ 43%] 2025-12-04T13:28:26.4876634Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_remainder_executor_aten_cuda_uint8 PASSED [0.2778s] [ 43%] 2025-12-04T13:28:26.4876766Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float16 PASSED [0.0448s] [ 43%] 2025-12-04T13:28:26.4876909Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_renorm_executor_aten_cuda_float32 PASSED [0.0294s] [ 43%] 2025-12-04T13:28:26.4877040Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_bfloat16 PASSED [0.1330s] [ 43%] 2025-12-04T13:28:26.4877176Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_complex128 PASSED [0.1337s] [ 43%] 2025-12-04T13:28:26.4877307Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_float64 PASSED [0.1320s] [ 43%] 2025-12-04T13:28:26.4877439Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_repeat_executor_aten_cuda_int32 PASSED [0.1300s] [ 43%] 2025-12-04T13:28:26.4877577Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bfloat16 PASSED [0.0920s] [ 43%] 2025-12-04T13:28:26.4877714Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_bool PASSED [0.0948s] [ 43%] 2025-12-04T13:28:26.4877868Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_complex64 PASSED [0.0931s] [ 43%] 2025-12-04T13:28:26.4878007Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int32 PASSED [0.0895s] [ 43%] 2025-12-04T13:28:26.4878142Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_as_executor_aten_cuda_int64 PASSED [0.0893s] [ 43%] 2025-12-04T13:28:26.4878277Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_float16 PASSED [0.1110s] [ 43%] 2025-12-04T13:28:26.4878409Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_reshape_executor_aten_cuda_int16 PASSED [0.1165s] [ 43%] 2025-12-04T13:28:26.4878538Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_roll_executor_aten_cuda_int64 PASSED [0.0541s] [ 43%] 2025-12-04T13:28:26.4878668Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_bool PASSED [0.0752s] [ 43%] 2025-12-04T13:28:26.4878798Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int32 PASSED [0.0741s] [ 43%] 2025-12-04T13:28:26.4878929Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rot90_executor_aten_cuda_int64 PASSED [0.0741s] [ 43%] 2025-12-04T13:28:26.4879061Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_bfloat16 PASSED [0.1059s] [ 43%] 2025-12-04T13:28:26.4879192Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_round_executor_aten_cuda_int32 PASSED [0.0584s] [ 44%] 2025-12-04T13:28:26.4879332Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_float32 PASSED [0.0765s] [ 44%] 2025-12-04T13:28:26.4879464Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsqrt_executor_aten_cuda_int64 PASSED [0.0874s] [ 44%] 2025-12-04T13:28:26.4879591Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_rsub_executor_aten_cuda_int16 PASSED [0.2849s] [ 44%] 2025-12-04T13:28:26.4879740Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_select_scatter_executor_aten_cuda_bfloat16 PASSED [0.0306s] [ 44%] 2025-12-04T13:28:26.4879865Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_bool PASSED [0.0728s] [ 44%] 2025-12-04T13:28:26.4879997Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_complex128 PASSED [0.1421s] 
[ 44%] 2025-12-04T13:28:26.4880123Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_int16 PASSED [0.0600s] [ 44%] 2025-12-04T13:28:26.4880247Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sgn_executor_aten_cuda_uint8 PASSED [0.0551s] [ 44%] 2025-12-04T13:28:26.4880393Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sigmoid_executor_aten_cuda_int32 PASSED [0.1381s] [ 44%] 2025-12-04T13:28:26.4880523Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sign_executor_aten_cuda_bfloat16 PASSED [0.1125s] [ 44%] 2025-12-04T13:28:26.4880676Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_bfloat16 PASSED [0.0860s] [ 44%] 2025-12-04T13:28:26.4880806Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_signbit_executor_aten_cuda_int64 PASSED [0.0587s] [ 44%] 2025-12-04T13:28:26.4880938Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_complex128 PASSED [0.2808s] [ 44%] 2025-12-04T13:28:26.4881062Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int64 PASSED [0.0856s] [ 44%] 2025-12-04T13:28:26.4881186Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sin_executor_aten_cuda_int8 PASSED [0.0794s] [ 44%] 2025-12-04T13:28:26.4881312Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_bool PASSED [0.1932s] [ 44%] 2025-12-04T13:28:26.4881439Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int32 PASSED [0.1571s] [ 44%] 2025-12-04T13:28:26.4881577Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinc_executor_aten_cuda_int64 PASSED [0.1572s] [ 44%] 2025-12-04T13:28:26.4881704Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_bool PASSED [0.0970s] [ 44%] 2025-12-04T13:28:26.4881834Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float16 
PASSED [0.1064s] [ 44%] 2025-12-04T13:28:26.4882010Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_float64 PASSED [0.0696s] [ 44%] 2025-12-04T13:28:26.4882139Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sinh_executor_aten_cuda_int64 PASSED [0.0843s] [ 44%] 2025-12-04T13:28:26.4882292Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_complex128 PASSED [0.0470s] [ 44%] 2025-12-04T13:28:26.4882439Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int32 PASSED [0.0438s] [ 44%] 2025-12-04T13:28:26.4882586Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int64 PASSED [0.0440s] [ 44%] 2025-12-04T13:28:26.4882730Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_softmax_with_dtype_executor_aten_cuda_int8 PASSED [0.0434s] [ 44%] 2025-12-04T13:28:26.4882873Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_bool PASSED [0.1066s] [ 44%] 2025-12-04T13:28:26.4883040Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_float32 PASSED [0.0770s] [ 44%] 2025-12-04T13:28:26.4883186Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j0_executor_aten_cuda_int32 PASSED [0.0874s] [ 44%] 2025-12-04T13:28:26.4883331Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_bessel_j1_executor_aten_cuda_int32 PASSED [0.3082s] [ 44%] 2025-12-04T13:28:26.4883473Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bfloat16 PASSED [0.2571s] [ 44%] 2025-12-04T13:28:26.4883610Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_bool PASSED [0.2578s] [ 44%] 2025-12-04T13:28:26.4883749Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_entr_executor_aten_cuda_int16 
PASSED [0.2085s] [ 44%] 2025-12-04T13:28:26.4883891Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_float32 PASSED [0.2551s] [ 44%] 2025-12-04T13:28:26.4884031Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int16 PASSED [0.0891s] [ 44%] 2025-12-04T13:28:26.4884168Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_erfcx_executor_aten_cuda_int64 PASSED [0.0878s] [ 44%] 2025-12-04T13:28:26.4884320Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float16 PASSED [0.4474s] [ 44%] 2025-12-04T13:28:26.4884457Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i0e_executor_aten_cuda_float32 PASSED [0.0753s] [ 44%] 2025-12-04T13:28:26.4884607Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_float32 PASSED [0.0819s] [ 44%] 2025-12-04T13:28:26.4884741Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1_executor_aten_cuda_int8 PASSED [0.0804s] [ 44%] 2025-12-04T13:28:26.4884882Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_bfloat16 PASSED [0.4547s] [ 44%] 2025-12-04T13:28:26.4885021Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_float64 PASSED [0.0761s] [ 44%] 2025-12-04T13:28:26.4885158Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_i1e_executor_aten_cuda_uint8 PASSED [0.0802s] [ 44%] 2025-12-04T13:28:26.4885327Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex128 PASSED [0.0594s] [ 44%] 2025-12-04T13:28:26.4885508Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex32 PASSED [0.0603s] [ 44%] 2025-12-04T13:28:26.4885673Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_complex64 PASSED [0.0593s] [ 44%] 2025-12-04T13:28:26.4885833Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float32 PASSED [0.0590s] [ 44%] 2025-12-04T13:28:26.4885995Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_log_softmax_with_dtype_executor_aten_cuda_float64 PASSED [0.0553s] [ 44%] 2025-12-04T13:28:26.4886135Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_logit_executor_aten_cuda_uint8 PASSED [0.1932s] [ 44%] 2025-12-04T13:28:26.4886304Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_1_executor_aten_cuda_int32 PASSED [0.3756s] [ 44%] 2025-12-04T13:28:26.4886470Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int32 PASSED [0.3830s] [ 44%] 2025-12-04T13:28:26.4886634Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_3_executor_aten_cuda_int64 PASSED [0.3717s] [ 44%] 2025-12-04T13:28:26.4886800Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_bfloat16 PASSED [2.2133s] [ 44%] 2025-12-04T13:28:26.4886976Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_float64 PASSED [0.3936s] [ 44%] 2025-12-04T13:28:26.4887141Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int64 PASSED [0.3690s] [ 44%] 2025-12-04T13:28:26.4887302Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_multigammaln_mvlgamma_p_5_executor_aten_cuda_int8 PASSED [0.3699s] [ 44%] 2025-12-04T13:28:26.4887447Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_bfloat16 PASSED [1.6061s] [ 44%] 
2025-12-04T13:28:26.4887589Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_float32 PASSED [0.1348s] [ 44%] 2025-12-04T13:28:26.4887727Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtr_executor_aten_cuda_int16 PASSED [0.1353s] [ 44%] 2025-12-04T13:28:26.4887864Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_ndtri_executor_aten_cuda_int8 PASSED [0.0801s] [ 44%] 2025-12-04T13:28:26.4888024Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bfloat16 PASSED [0.0427s] [ 44%] 2025-12-04T13:28:26.4888190Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_softmax_with_dtype_executor_aten_cuda_bool PASSED [0.0431s] [ 44%] 2025-12-04T13:28:26.4888349Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_float64 PASSED [0.3747s] [ 44%] 2025-12-04T13:28:26.4888516Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_spherical_bessel_j0_executor_aten_cuda_int32 PASSED [0.0857s] [ 44%] 2025-12-04T13:28:26.4888659Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_special_xlog1py_executor_aten_cuda_uint8 PASSED [0.7106s] [ 44%] 2025-12-04T13:28:26.4888809Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex128 PASSED [0.0171s] [ 44%] 2025-12-04T13:28:26.4888956Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_split_with_sizes_executor_aten_cuda_complex64 PASSED [0.0219s] [ 45%] 2025-12-04T13:28:26.4889088Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bfloat16 PASSED [0.1094s] [ 45%] 2025-12-04T13:28:26.4889215Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_bool PASSED [0.0953s] [ 45%] 2025-12-04T13:28:26.4889358Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_complex32 PASSED [0.3456s] [ 45%] 2025-12-04T13:28:26.4889485Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int32 PASSED [0.0780s] [ 45%] 2025-12-04T13:28:26.4889613Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sqrt_executor_aten_cuda_int64 PASSED [0.0778s] [ 45%] 2025-12-04T13:28:26.4889749Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex128 PASSED [0.0948s] [ 45%] 2025-12-04T13:28:26.4889903Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_complex64 SKIPPED [0.0002s] (Skipped!) [ 45%] 2025-12-04T13:28:26.4890038Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float16 PASSED [0.1195s] [ 45%] 2025-12-04T13:28:26.4890171Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_float32 PASSED [0.0790s] [ 45%] 2025-12-04T13:28:26.4890303Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_square_executor_aten_cuda_int16 PASSED [0.0688s] [ 45%] 2025-12-04T13:28:26.4890442Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_copy_executor_aten_cuda_float16 PASSED [0.0241s] [ 45%] 2025-12-04T13:28:26.4890577Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bfloat16 PASSED [0.0180s] [ 45%] 2025-12-04T13:28:26.4890716Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_bool PASSED [0.0176s] [ 45%] 2025-12-04T13:28:26.4890853Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_complex64 PASSED [0.0184s] [ 45%] 2025-12-04T13:28:26.4890986Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float16 PASSED [0.0178s] [ 45%] 2025-12-04T13:28:26.4891121Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_float32 PASSED [1.4845s] [ 45%] 2025-12-04T13:28:26.4891254Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_executor_aten_cuda_int16 PASSED [0.0200s] [ 45%] 2025-12-04T13:28:26.4891405Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_complex64 PASSED [0.0147s] [ 45%] 2025-12-04T13:28:26.4891552Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_squeeze_multiple_executor_aten_cuda_float16 PASSED [0.0137s] [ 45%] 2025-12-04T13:28:26.4891684Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float32 PASSED [0.0356s] [ 45%] 2025-12-04T13:28:26.4891816Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_stack_executor_aten_cuda_float64 PASSED [0.0354s] [ 45%] 2025-12-04T13:28:26.4892009Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_std_executor_aten_cuda_complex64 PASSED [0.0405s] [ 45%] 2025-12-04T13:28:26.4892140Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_bfloat16 PASSED [0.4934s] [ 45%] 2025-12-04T13:28:26.4892279Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_float64 PASSED [0.3085s] [ 45%] 2025-12-04T13:28:26.4892404Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sub_executor_aten_cuda_int8 PASSED [0.2848s] [ 45%] 2025-12-04T13:28:26.4892527Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int16 PASSED [0.0529s] [ 45%] 2025-12-04T13:28:26.4892652Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int32 PASSED [0.0543s] [ 45%] 2025-12-04T13:28:26.4892776Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_int64 PASSED [0.0457s] [ 45%] 2025-12-04T13:28:26.4892901Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_executor_aten_cuda_uint8 PASSED 
[0.0541s] [ 45%] 2025-12-04T13:28:26.4893046Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_sum_to_size_executor_aten_cuda_uint8 PASSED [0.0450s] [ 45%] 2025-12-04T13:28:26.4893183Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_complex128 PASSED [0.0096s] [ 45%] 2025-12-04T13:28:26.4893314Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_copy_executor_aten_cuda_float32 PASSED [1.4215s] [ 45%] 2025-12-04T13:28:26.4893438Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_float32 PASSED [0.0105s] [ 45%] 2025-12-04T13:28:26.4893564Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_t_executor_aten_cuda_uint8 PASSED [0.0076s] [ 45%] 2025-12-04T13:28:26.4893712Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_float32 PASSED [0.0220s] [ 45%] 2025-12-04T13:28:26.4893855Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_take_along_dim_executor_aten_cuda_uint8 PASSED [0.0200s] [ 45%] 2025-12-04T13:28:26.4893982Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_bool PASSED [0.0951s] [ 45%] 2025-12-04T13:28:26.4894112Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_float16 PASSED [0.1042s] [ 45%] 2025-12-04T13:28:26.4894235Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tan_executor_aten_cuda_int64 PASSED [0.0777s] [ 45%] 2025-12-04T13:28:26.4894361Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_int64 PASSED [0.0771s] [ 45%] 2025-12-04T13:28:26.4894487Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tanh_executor_aten_cuda_uint8 PASSED [0.0728s] [ 45%] 2025-12-04T13:28:26.4894638Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tensor_split_executor_aten_cuda_int16 PASSED [0.0348s] [ 45%] 2025-12-04T13:28:26.4894764Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_bool PASSED [0.0628s] [ 45%] 2025-12-04T13:28:26.4894894Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_complex128 PASSED [0.0624s] [ 45%] 2025-12-04T13:28:26.4895019Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_to_executor_aten_cuda_float32 PASSED [0.0623s] [ 45%] 2025-12-04T13:28:26.4895152Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_complex64 PASSED [0.0053s] [ 45%] 2025-12-04T13:28:26.4895280Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int32 PASSED [0.0053s] [ 45%] 2025-12-04T13:28:26.4895405Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trace_executor_aten_cuda_int8 PASSED [0.0051s] [ 45%] 2025-12-04T13:28:26.4895554Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex128 PASSED [0.0222s] [ 45%] 2025-12-04T13:28:26.4895714Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex32 PASSED [0.0228s] [ 45%] 2025-12-04T13:28:26.4895863Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_complex64 PASSED [0.0216s] [ 45%] 2025-12-04T13:28:26.4896018Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_float16 PASSED [0.0222s] [ 45%] 2025-12-04T13:28:26.4896164Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int16 PASSED [0.0216s] [ 45%] 2025-12-04T13:28:26.4896304Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int32 PASSED [0.0211s] [ 45%] 2025-12-04T13:28:26.4896443Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_int8 PASSED [0.0219s] [ 45%] 2025-12-04T13:28:26.4896583Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_copy_executor_aten_cuda_uint8 PASSED [0.0214s] [ 45%] 2025-12-04T13:28:26.4896719Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_bool PASSED [0.0173s] [ 45%] 2025-12-04T13:28:26.4896871Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_float64 PASSED [0.0186s] [ 45%] 2025-12-04T13:28:26.4897007Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_transpose_executor_aten_cuda_int16 PASSED [0.0166s] [ 45%] 2025-12-04T13:28:26.4897133Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_bool PASSED [0.0514s] [ 45%] 2025-12-04T13:28:26.4897265Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex32 PASSED [0.0521s] [ 45%] 2025-12-04T13:28:26.4897396Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_complex64 PASSED [0.0519s] [ 45%] 2025-12-04T13:28:26.4897525Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float16 PASSED [0.0518s] [ 45%] 2025-12-04T13:28:26.4897656Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_float64 PASSED [0.0518s] [ 45%] 2025-12-04T13:28:26.4897782Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_tril_executor_aten_cuda_uint8 PASSED [0.0496s] [ 45%] 2025-12-04T13:28:26.4897912Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_float32 PASSED [0.0510s] [ 45%] 2025-12-04T13:28:26.4898039Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int32 PASSED [0.0501s] [ 46%] 2025-12-04T13:28:26.4898167Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_triu_executor_aten_cuda_int64 PASSED [0.0569s] [ 46%] 2025-12-04T13:28:26.4898320Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_complex128 
PASSED [0.3260s] [ 46%] 2025-12-04T13:28:26.4898458Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_true_divide_executor_aten_cuda_float64 PASSED [0.3080s] [ 46%] 2025-12-04T13:28:26.4898592Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_bfloat16 PASSED [0.1064s] [ 46%] 2025-12-04T13:28:26.4898723Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_float32 PASSED [0.0689s] [ 46%] 2025-12-04T13:28:26.4898852Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int32 PASSED [0.0575s] [ 46%] 2025-12-04T13:28:26.4898979Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_trunc_executor_aten_cuda_int64 PASSED [0.0581s] [ 46%] 2025-12-04T13:28:26.4899116Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_bool PASSED [0.0576s] [ 46%] 2025-12-04T13:28:26.4899260Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_complex32 PASSED [0.0545s] [ 46%] 2025-12-04T13:28:26.4899400Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_float16 PASSED [0.0525s] [ 46%] 2025-12-04T13:28:26.4899547Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_int16 PASSED [0.0500s] [ 46%] 2025-12-04T13:28:26.4899683Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_copy_executor_aten_cuda_uint8 PASSED [0.0501s] [ 46%] 2025-12-04T13:28:26.4899826Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_bfloat16 PASSED [0.0396s] [ 46%] 2025-12-04T13:28:26.4899957Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int16 PASSED [0.0374s] [ 46%] 2025-12-04T13:28:26.4900085Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_int8 PASSED [0.0373s] [ 46%] 2025-12-04T13:28:26.4900215Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unbind_executor_aten_cuda_uint8 PASSED [0.0372s] [ 46%] 2025-12-04T13:28:26.4900355Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_complex32 PASSED [0.0253s] [ 46%] 2025-12-04T13:28:26.4900488Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unflatten_executor_aten_cuda_int8 PASSED [0.0243s] [ 46%] 2025-12-04T13:28:26.4900642Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_complex32 PASSED [0.0641s] [ 46%] 2025-12-04T13:28:26.4900782Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_float16 PASSED [0.0627s] [ 46%] 2025-12-04T13:28:26.4900921Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_copy_executor_aten_cuda_uint8 PASSED [0.0605s] [ 46%] 2025-12-04T13:28:26.4901057Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_complex64 PASSED [0.0503s] [ 46%] 2025-12-04T13:28:26.4901191Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unfold_executor_aten_cuda_int64 PASSED [0.0480s] [ 46%] 2025-12-04T13:28:26.4901337Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_bfloat16 PASSED [1.4934s] [ 46%] 2025-12-04T13:28:26.4901481Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_copy_executor_aten_cuda_int32 PASSED [0.0263s] [ 46%] 2025-12-04T13:28:26.4901624Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex128 PASSED [0.0209s] [ 46%] 2025-12-04T13:28:26.4901764Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_complex64 PASSED [0.0205s] [ 46%] 2025-12-04T13:28:26.4901946Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int16 PASSED [0.0188s] [ 46%] 2025-12-04T13:28:26.4902080Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_unsqueeze_executor_aten_cuda_int64 PASSED [0.0193s] [ 46%] 2025-12-04T13:28:26.4902227Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_bfloat16 PASSED [0.0469s] [ 46%] 2025-12-04T13:28:26.4902356Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_executor_aten_cuda_float32 PASSED [0.0319s] [ 46%] 2025-12-04T13:28:26.4902491Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float16 PASSED [0.0790s] [ 46%] 2025-12-04T13:28:26.4902626Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float32 PASSED [0.0504s] [ 46%] 2025-12-04T13:28:26.4902760Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_var_mean_executor_aten_cuda_float64 PASSED [0.0497s] [ 46%] 2025-12-04T13:28:26.4902890Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_bfloat16 PASSED [0.0062s] [ 46%] 2025-12-04T13:28:26.4903023Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vdot_executor_aten_cuda_complex128 PASSED [0.0096s] [ 46%] 2025-12-04T13:28:26.4903169Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_complex_executor_aten_cuda_float32 PASSED [0.0047s] [ 46%] 2025-12-04T13:28:26.4903315Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_as_executor_aten_cuda_float16 PASSED [0.0817s] [ 46%] 2025-12-04T13:28:26.4903456Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_complex128 PASSED [0.0199s] [ 46%] 2025-12-04T13:28:26.4903604Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_float32 PASSED [0.0203s] [ 46%] 2025-12-04T13:28:26.4903737Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_copy_executor_aten_cuda_int32 PASSED [0.0188s] [ 46%] 2025-12-04T13:28:26.4903868Z 
test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_bfloat16 PASSED [0.1022s] [ 46%] 2025-12-04T13:28:26.4904002Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_complex128 PASSED [0.1128s] [ 46%] 2025-12-04T13:28:26.4904131Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_float32 PASSED [0.1104s] [ 46%] 2025-12-04T13:28:26.4904261Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_int64 PASSED [0.0989s] [ 46%] 2025-12-04T13:28:26.4904407Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_view_executor_aten_cuda_uint8 PASSED [0.1082s] [ 46%] 2025-12-04T13:28:26.4904543Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vsplit_executor_aten_cuda_complex64 PASSED [0.0092s] [ 46%] 2025-12-04T13:28:26.4904677Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_bfloat16 PASSED [0.0106s] [ 46%] 2025-12-04T13:28:26.4904812Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_float16 PASSED [1.4240s] [ 46%] 2025-12-04T13:28:26.4904942Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_vstack_executor_aten_cuda_int32 PASSED [0.0129s] [ 46%] 2025-12-04T13:28:26.4905072Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_float16 PASSED [0.0671s] [ 46%] 2025-12-04T13:28:26.4905203Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_where_executor_aten_cuda_int16 PASSED [0.0514s] [ 46%] 2025-12-04T13:28:26.4905336Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_bfloat16 PASSED [0.8090s] [ 46%] 2025-12-04T13:28:26.4905464Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_int64 PASSED [0.7149s] [ 46%] 2025-12-04T13:28:26.4905592Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_xlogy_executor_aten_cuda_uint8 PASSED 
[0.7077s] [ 46%] 2025-12-04T13:28:26.4905719Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_bool PASSED [0.0066s] [ 46%] 2025-12-04T13:28:26.4905864Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_complex128 PASSED [0.0064s] [ 46%] 2025-12-04T13:28:26.4905996Z test_ops.py::TestCommonCUDA::test_python_ref_executor__refs_zeros_executor_aten_cuda_float16 PASSED [0.0063s] [ 46%] 2025-12-04T13:28:26.4906106Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_bfloat16 PASSED [1.4837s] [ 46%] 2025-12-04T13:28:26.4906214Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_complex64 PASSED [1.4079s] [ 46%] 2025-12-04T13:28:26.4906321Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_float16 PASSED [1.3912s] [ 46%] 2025-12-04T13:28:26.4906423Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_T_cuda_int64 PASSED [1.3705s] [ 46%] 2025-12-04T13:28:26.4906553Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_bfloat16 PASSED [1.4036s] [ 46%] 2025-12-04T13:28:26.4906683Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_float64 PASSED [1.4423s] [ 46%] 2025-12-04T13:28:26.4906808Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int16 PASSED [1.4151s] [ 46%] 2025-12-04T13:28:26.4906933Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int32 PASSED [1.4299s] [ 46%] 2025-12-04T13:28:26.4907066Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bfloat16_cuda_int64 PASSED [1.4415s] [ 47%] 2025-12-04T13:28:26.4907186Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_bool PASSED [1.4125s] [ 47%] 2025-12-04T13:28:26.4907321Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_bool_cuda_float64 PASSED [1.4393s] [ 47%] 2025-12-04T13:28:26.4907446Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_complex128 PASSED [1.4459s] [ 47%] 2025-12-04T13:28:26.4907567Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_byte_cuda_int32 PASSED [1.4249s] [ 47%] 2025-12-04T13:28:26.4907695Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bfloat16 PASSED [1.4452s] [ 47%] 2025-12-04T13:28:26.4907818Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_bool PASSED [1.4291s] [ 47%] 2025-12-04T13:28:26.4907951Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_complex64 PASSED [1.4325s] [ 47%] 2025-12-04T13:28:26.4908087Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_float32 PASSED [1.4359s] [ 47%] 2025-12-04T13:28:26.4908209Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cdouble_cuda_int8 PASSED [1.4286s] [ 47%] 2025-12-04T13:28:26.4908339Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex32 PASSED [1.4533s] [ 47%] 2025-12-04T13:28:26.4908466Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_complex64 PASSED [1.4216s] [ 47%] 2025-12-04T13:28:26.4908589Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_cfloat_cuda_int32 PASSED [1.4454s] [ 47%] 2025-12-04T13:28:26.4908717Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_complex64 PASSED [1.4732s] [ 47%] 2025-12-04T13:28:26.4908842Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_chalf_cuda_float64 PASSED [1.4374s] [ 47%] 2025-12-04T13:28:26.4908969Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float32 PASSED [0.1450s] [ 47%] 2025-12-04T13:28:26.4909097Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_complex_cuda_float64 PASSED [0.1398s] [ 47%] 2025-12-04T13:28:26.4909225Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_double_cuda_complex64 PASSED [0.0531s] [ 47%] 2025-12-04T13:28:26.4909349Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bfloat16 PASSED [0.0395s] [ 47%] 2025-12-04T13:28:26.4909470Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_bool PASSED [1.4408s] [ 47%] 2025-12-04T13:28:26.4909601Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_int64 PASSED [1.4292s] [ 47%] 2025-12-04T13:28:26.4909723Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_float_cuda_uint8 PASSED [1.4579s] [ 47%] 2025-12-04T13:28:26.4909842Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_half_cuda_uint8 PASSED [1.4475s] [ 47%] 2025-12-04T13:28:26.4909967Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_complex64 PASSED [1.4757s] [ 47%] 2025-12-04T13:28:26.4910088Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float32 PASSED [1.4452s] [ 47%] 2025-12-04T13:28:26.4910209Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_float64 PASSED [1.4401s] [ 47%] 2025-12-04T13:28:26.4910328Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_int_cuda_int8 PASSED [1.4474s] [ 47%] 2025-12-04T13:28:26.4910455Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex128 PASSED [1.4735s] [ 47%] 2025-12-04T13:28:26.4910583Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_complex64 PASSED [1.4823s] [ 47%] 2025-12-04T13:28:26.4910716Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float16 PASSED [1.4600s] [ 47%] 2025-12-04T13:28:26.4910841Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_long_cuda_float64 PASSED [1.4637s] [ 47%] 2025-12-04T13:28:26.4910985Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs__conversions_short_cuda_float16 PASSED [1.4654s] [ 47%] 2025-12-04T13:28:26.4913127Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_abs_cuda_complex64 PASSED [1.4659s] [ 47%] 2025-12-04T13:28:26.4913236Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_float32 PASSED [1.4356s] [ 47%] 2025-12-04T13:28:26.4913345Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int64 PASSED [1.4545s] [ 47%] 2025-12-04T13:28:26.4913452Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acos_cuda_int8 PASSED [1.4438s] [ 47%] 2025-12-04T13:28:26.4913565Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_float32 PASSED [1.4484s] [ 47%] 2025-12-04T13:28:26.4913671Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_int16 PASSED [1.4769s] [ 47%] 2025-12-04T13:28:26.4913780Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_acosh_cuda_uint8 PASSED [1.4767s] [ 47%] 2025-12-04T13:28:26.4913889Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_bfloat16 PASSED [0.1693s] [ 47%] 2025-12-04T13:28:26.4913997Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_add_cuda_int32 PASSED [0.1168s] [ 47%] 2025-12-04T13:28:26.4914124Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcdiv_cuda_float64 PASSED [1.4613s] [ 47%] 2025-12-04T13:28:26.4914237Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_float64 PASSED [1.4699s] [ 47%] 2025-12-04T13:28:26.4914346Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addcmul_cuda_uint8 PASSED [1.4840s] [ 47%] 2025-12-04T13:28:26.4914456Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_bfloat16 PASSED [1.4399s] [ 47%] 2025-12-04T13:28:26.4914567Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_addr_cuda_complex128 PASSED [1.4447s] [ 47%] 2025-12-04T13:28:26.4914687Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_bfloat16 PASSED [1.4197s] [ 47%] 2025-12-04T13:28:26.4914806Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_complex64 PASSED [1.4348s] [ 47%] 2025-12-04T13:28:26.4914923Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_alias_copy_cuda_float32 PASSED [1.4119s] [ 47%] 2025-12-04T13:28:26.4915032Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_float32 PASSED [1.4373s] [ 47%] 2025-12-04T13:28:26.4915136Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_all_cuda_uint8 PASSED [1.4294s] [ 47%] 2025-12-04T13:28:26.4915270Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_allclose_cuda_float16 PASSED [1.4567s] [ 47%] 2025-12-04T13:28:26.4915378Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amax_cuda_int32 PASSED [1.4401s] [ 47%] 2025-12-04T13:28:26.4915482Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_amin_cuda_bool PASSED [1.4530s] [ 47%] 2025-12-04T13:28:26.4915587Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float16 PASSED [1.4327s] [ 47%] 2025-12-04T13:28:26.4915695Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_any_cuda_float64 PASSED [1.4447s] [ 47%] 2025-12-04T13:28:26.4915804Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_bfloat16 PASSED [0.0272s] [ 47%] 2025-12-04T13:28:26.4915914Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_arange_cuda_int32 PASSED [0.0104s] [ 47%] 2025-12-04T13:28:26.4916037Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_complex64 PASSED [0.0060s] [ 47%] 2025-12-04T13:28:26.4916158Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_copy_cuda_float64 PASSED [0.0056s] [ 47%] 2025-12-04T13:28:26.4916298Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_bool PASSED [0.0051s] [ 47%] 2025-12-04T13:28:26.4916435Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_complex64 PASSED [0.0050s] [ 47%] 2025-12-04T13:28:26.4916578Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float32 PASSED [0.0050s] [ 47%] 2025-12-04T13:28:26.4916716Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_float64 PASSED [0.0051s] [ 47%] 2025-12-04T13:28:26.4916886Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_partial_views_cuda_int16 PASSED [0.0049s] [ 47%] 2025-12-04T13:28:26.4917009Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float16 PASSED [0.0082s] [ 47%] 2025-12-04T13:28:26.4917135Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_float32 PASSED [1.4188s] [ 47%] 2025-12-04T13:28:26.4917256Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int32 PASSED [1.4209s] [ 48%] 2025-12-04T13:28:26.4917378Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_as_strided_scatter_cuda_int8 PASSED [1.4298s] [ 48%] 2025-12-04T13:28:26.4917489Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asin_cuda_complex128 PASSED [1.4691s] [ 48%] 2025-12-04T13:28:26.4917604Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_complex128 PASSED [1.4698s] [ 48%] 2025-12-04T13:28:26.4917709Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_asinh_cuda_int16 PASSED [1.4312s] [ 48%] 2025-12-04T13:28:26.4917822Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan2_cuda_bfloat16 PASSED [0.1097s] [ 48%] 2025-12-04T13:28:26.4917934Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_complex32 PASSED [0.0555s] [ 48%] 2025-12-04T13:28:26.4918043Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int16 PASSED [1.4549s] [ 48%] 2025-12-04T13:28:26.4918150Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int32 PASSED 
[1.4571s] [ 48%] 2025-12-04T13:28:26.4918255Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atan_cuda_int64 PASSED [1.4313s] [ 48%] 2025-12-04T13:28:26.4918364Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float32 PASSED [1.4612s] [ 48%] 2025-12-04T13:28:26.4918476Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_float64 PASSED [1.4469s] [ 48%] 2025-12-04T13:28:26.4918584Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int64 PASSED [1.4365s] [ 48%] 2025-12-04T13:28:26.4918689Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atanh_cuda_int8 PASSED [1.4491s] [ 48%] 2025-12-04T13:28:26.4918809Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_bfloat16 PASSED [1.4244s] [ 48%] 2025-12-04T13:28:26.4918939Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_complex128 PASSED [1.4568s] [ 48%] 2025-12-04T13:28:26.4919058Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_float32 PASSED [1.4266s] [ 48%] 2025-12-04T13:28:26.4919169Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_int32 PASSED [1.4268s] [ 48%] 2025-12-04T13:28:26.4919282Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_1d_cuda_uint8 PASSED [1.4373s] [ 48%] 2025-12-04T13:28:26.4919392Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_bool PASSED [1.4321s] [ 48%] 2025-12-04T13:28:26.4919508Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_float64 PASSED [1.4295s] [ 48%] 2025-12-04T13:28:26.4919619Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_2d_cuda_int16 PASSED [1.4271s] [ 48%] 2025-12-04T13:28:26.4919737Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bfloat16 PASSED [1.4331s] [ 48%] 2025-12-04T13:28:26.4919845Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_bool PASSED [1.4246s] [ 48%] 
2025-12-04T13:28:26.4919981Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_atleast_3d_cuda_complex128 PASSED [1.4270s] [ 48%] 2025-12-04T13:28:26.4920094Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_not_cuda_int16 PASSED [1.4441s] [ 48%] 2025-12-04T13:28:26.4920217Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int32 PASSED [0.1026s] [ 48%] 2025-12-04T13:28:26.4920329Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_or_cuda_int8 PASSED [0.1154s] [ 48%] 2025-12-04T13:28:26.4920455Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_bool PASSED [0.1088s] [ 48%] 2025-12-04T13:28:26.4920568Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int32 PASSED [0.0977s] [ 48%] 2025-12-04T13:28:26.4920679Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_int8 PASSED [0.0957s] [ 48%] 2025-12-04T13:28:26.4920792Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bitwise_xor_cuda_uint8 PASSED [0.1095s] [ 48%] 2025-12-04T13:28:26.4920902Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_block_diag_cuda_int8 PASSED [1.4468s] [ 48%] 2025-12-04T13:28:26.4921029Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_broadcast_tensors_cuda_float32 PASSED [1.4328s] [ 48%] 2025-12-04T13:28:26.4921138Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_int8 PASSED [0.1889s] [ 48%] 2025-12-04T13:28:26.4921249Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_bucketize_cuda_uint8 PASSED [1.5707s] [ 48%] 2025-12-04T13:28:26.4921359Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cat_cuda_complex32 PASSED [0.0173s] [ 48%] 2025-12-04T13:28:26.4921471Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_float16 PASSED [0.0463s] [ 48%] 2025-12-04T13:28:26.4921577Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ceil_cuda_int16 PASSED [0.0232s] [ 48%] 2025-12-04T13:28:26.4921685Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_chunk_cuda_bool PASSED [0.0180s] [ 48%] 2025-12-04T13:28:26.4921789Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_cuda_int16 PASSED [0.0515s] [ 48%] 2025-12-04T13:28:26.4921933Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_bool PASSED [0.0855s] [ 48%] 2025-12-04T13:28:26.4922047Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_float32 PASSED [0.0946s] [ 48%] 2025-12-04T13:28:26.4922158Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int16 PASSED [0.0932s] [ 48%] 2025-12-04T13:28:26.4922270Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_max_cuda_int64 PASSED [0.0930s] [ 48%] 2025-12-04T13:28:26.4922382Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clamp_min_cuda_float32 PASSED [0.0938s] [ 48%] 2025-12-04T13:28:26.4922506Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_bfloat16 PASSED [0.0491s] [ 48%] 2025-12-04T13:28:26.4922617Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_complex64 PASSED [0.0494s] [ 48%] 2025-12-04T13:28:26.4922729Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_float16 PASSED [0.0489s] [ 48%] 2025-12-04T13:28:26.4922834Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int16 PASSED [0.0489s] [ 48%] 2025-12-04T13:28:26.4922942Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_clone_cuda_int8 PASSED [0.0488s] [ 48%] 2025-12-04T13:28:26.4925159Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_bfloat16 PASSED [0.0060s] [ 48%] 2025-12-04T13:28:26.4925289Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_column_stack_cuda_complex128 PASSED [0.0055s] [ 48%] 2025-12-04T13:28:26.4925394Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_cuda_uint8 PASSED [0.0138s] [ 48%] 2025-12-04T13:28:26.4925515Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_bfloat16 PASSED [0.0149s] [ 48%] 2025-12-04T13:28:26.4925661Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_conj_physical_cuda_int8 PASSED [0.0124s] [ 48%] 2025-12-04T13:28:26.4925779Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex32 PASSED [0.0318s] [ 48%] 2025-12-04T13:28:26.4925897Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_complex64 PASSED [1.4744s] [ 48%] 2025-12-04T13:28:26.4926029Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_contiguous_cuda_int8 PASSED [1.4794s] [ 48%] 2025-12-04T13:28:26.4926160Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_float16 PASSED [0.2427s] [ 48%] 2025-12-04T13:28:26.4926270Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_copysign_cuda_int32 PASSED [0.1759s] [ 48%] 2025-12-04T13:28:26.4926376Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_bfloat16 PASSED [0.0346s] [ 48%] 2025-12-04T13:28:26.4926481Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_float64 PASSED [0.0280s] [ 48%] 2025-12-04T13:28:26.4926585Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cos_cuda_int8 PASSED [0.0210s] [ 48%] 2025-12-04T13:28:26.4926692Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_float16 PASSED [1.4643s] [ 48%] 2025-12-04T13:28:26.4926795Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cosh_cuda_int32 PASSED [1.4585s] [ 48%] 2025-12-04T13:28:26.4926911Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_int8 PASSED [1.4579s] [ 48%] 2025-12-04T13:28:26.4927028Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_count_nonzero_cuda_uint8 PASSED [1.4548s] [ 49%] 2025-12-04T13:28:26.4927136Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float16 PASSED [1.4516s] [ 49%] 2025-12-04T13:28:26.4927245Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_cumsum_cuda_float32 PASSED [1.4238s] [ 49%] 2025-12-04T13:28:26.4927351Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_deg2rad_cuda_bool PASSED [1.4813s] [ 49%] 2025-12-04T13:28:26.4927469Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_complex64 PASSED [1.4823s] [ 49%] 2025-12-04T13:28:26.4927583Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float16 PASSED [1.4959s] [ 49%] 2025-12-04T13:28:26.4927694Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diag_embed_cuda_float32 PASSED [1.4824s] [ 49%] 2025-12-04T13:28:26.4927812Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int16 PASSED [1.4665s] [ 49%] 2025-12-04T13:28:26.4927929Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_copy_cuda_int64 PASSED [1.4716s] [ 49%] 2025-12-04T13:28:26.4928037Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_cuda_int32 PASSED [1.4613s] [ 49%] 2025-12-04T13:28:26.4928175Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_complex64 PASSED [1.4481s] [ 49%] 2025-12-04T13:28:26.4928296Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_float32 PASSED [1.4419s] [ 49%] 2025-12-04T13:28:26.4928417Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_diagonal_scatter_cuda_int64 PASSED [1.4498s] [ 49%] 2025-12-04T13:28:26.4928526Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float32 PASSED [1.4741s] [ 49%] 2025-12-04T13:28:26.4928636Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_float64 PASSED [1.4761s] [ 49%] 2025-12-04T13:28:26.4928744Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_digamma_cuda_int16 PASSED [1.4778s] [ 49%] 2025-12-04T13:28:26.4928870Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_floor_rounding_cuda_bfloat16 PASSED [0.6421s] [ 49%] 
2025-12-04T13:28:26.4928993Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_float16 PASSED [0.1136s] [ 49%] 2025-12-04T13:28:26.4929116Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_int64 PASSED [0.1218s] [ 49%] 2025-12-04T13:28:26.4929256Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_no_rounding_mode_cuda_uint8 PASSED [0.1194s] [ 49%] 2025-12-04T13:28:26.4929379Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_float64 PASSED [0.1372s] [ 49%] 2025-12-04T13:28:26.4929497Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_div_trunc_rounding_cuda_int64 PASSED [0.1211s] [ 49%] 2025-12-04T13:28:26.4929617Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dot_cuda_complex64 PASSED [0.0053s] [ 49%] 2025-12-04T13:28:26.4929739Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_complex64 PASSED [1.4568s] [ 49%] 2025-12-04T13:28:26.4929847Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dsplit_cuda_float32 PASSED [1.4576s] [ 49%] 2025-12-04T13:28:26.4929959Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_complex32 PASSED [1.4482s] [ 49%] 2025-12-04T13:28:26.4930067Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_float64 PASSED [1.4318s] [ 49%] 2025-12-04T13:28:26.4930174Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int32 PASSED [1.4615s] [ 49%] 2025-12-04T13:28:26.4930278Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_dstack_cuda_int64 PASSED [1.4903s] [ 49%] 2025-12-04T13:28:26.4930384Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_bool PASSED [0.0065s] [ 49%] 2025-12-04T13:28:26.4930490Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_cuda_int16 PASSED [0.0045s] [ 49%] 2025-12-04T13:28:26.4930609Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_complex128 PASSED [0.0192s] [ 49%] 
2025-12-04T13:28:26.4930717Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_like_cuda_int8 PASSED [1.4782s] [ 49%] 2025-12-04T13:28:26.4930832Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_bool PASSED [1.4610s] [ 49%] 2025-12-04T13:28:26.4930954Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_empty_strided_cuda_complex128 PASSED [1.4526s] [ 49%] 2025-12-04T13:28:26.4931058Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_int8 PASSED [0.1026s] [ 49%] 2025-12-04T13:28:26.4931161Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eq_cuda_uint8 PASSED [0.0983s] [ 49%] 2025-12-04T13:28:26.4931266Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int16 XFAIL [0.0039s] [ 49%] 2025-12-04T13:28:26.4931370Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_equal_cuda_int64 XFAIL [1.4507s] [ 49%] 2025-12-04T13:28:26.4931478Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_bfloat16 PASSED [1.4774s] [ 49%] 2025-12-04T13:28:26.4931582Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_float64 PASSED [0.0260s] [ 49%] 2025-12-04T13:28:26.4931694Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erf_cuda_int64 PASSED [0.0203s] [ 49%] 2025-12-04T13:28:26.4931800Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfc_cuda_float16 PASSED [0.0315s] [ 49%] 2025-12-04T13:28:26.4931939Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_erfinv_cuda_bool PASSED [0.0275s] [ 49%] 2025-12-04T13:28:26.4932045Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_float16 PASSED [0.0310s] [ 49%] 2025-12-04T13:28:26.4932148Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp2_cuda_uint8 PASSED [0.0206s] [ 49%] 2025-12-04T13:28:26.4932258Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex32 PASSED [0.0477s] [ 49%] 2025-12-04T13:28:26.4932366Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_complex64 PASSED [0.0388s] [ 49%] 2025-12-04T13:28:26.4932471Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float16 PASSED [0.0309s] [ 49%] 2025-12-04T13:28:26.4932577Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_float64 PASSED [0.0276s] [ 49%] 2025-12-04T13:28:26.4932680Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int16 PASSED [0.0221s] [ 49%] 2025-12-04T13:28:26.4932798Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_int8 PASSED [0.0203s] [ 49%] 2025-12-04T13:28:26.4932901Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_exp_cuda_uint8 PASSED [0.0205s] [ 49%] 2025-12-04T13:28:26.4933013Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float16 PASSED [1.4832s] [ 49%] 2025-12-04T13:28:26.4933137Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_as_cuda_float32 PASSED [1.4695s] [ 49%] 2025-12-04T13:28:26.4933268Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bfloat16 PASSED [1.4576s] [ 49%] 2025-12-04T13:28:26.4933383Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_bool PASSED [1.4537s] [ 49%] 2025-12-04T13:28:26.4933503Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_complex64 PASSED [1.4702s] [ 49%] 2025-12-04T13:28:26.4933618Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float32 PASSED [1.4577s] [ 49%] 2025-12-04T13:28:26.4933734Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_copy_cuda_float64 PASSED [1.4932s] [ 49%] 2025-12-04T13:28:26.4933839Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expand_cuda_int32 PASSED [1.4701s] [ 49%] 2025-12-04T13:28:26.4933950Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_complex64 PASSED [1.5215s] [ 49%] 2025-12-04T13:28:26.4934057Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_expm1_cuda_float64 PASSED [1.4749s] [ 49%] 2025-12-04T13:28:26.4934161Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_bool PASSED [1.5360s] [ 49%] 2025-12-04T13:28:26.4934264Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float16 PASSED [1.5152s] [ 49%] 2025-12-04T13:28:26.4934378Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_float8_e4m3fnuz PASSED [1.5045s] [ 49%] 2025-12-04T13:28:26.4934480Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_eye_cuda_int64 PASSED [0.0705s] [ 49%] 2025-12-04T13:28:26.4934590Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_float32 PASSED [0.0079s] [ 50%] 2025-12-04T13:28:26.4934697Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fft2_cuda_uint8 PASSED [0.0054s] [ 50%] 2025-12-04T13:28:26.4934815Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bfloat16 PASSED [0.0060s] [ 50%] 2025-12-04T13:28:26.4934925Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_bool PASSED [0.0057s] [ 50%] 2025-12-04T13:28:26.4935046Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex128 PASSED [0.0057s] [ 50%] 2025-12-04T13:28:26.4935164Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_complex32 PASSED [0.0057s] [ 50%] 2025-12-04T13:28:26.4935290Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int64 PASSED [0.0057s] [ 50%] 2025-12-04T13:28:26.4935402Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_int8 PASSED [0.0056s] [ 50%] 2025-12-04T13:28:26.4935514Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_fftshift_cuda_uint8 PASSED [0.0057s] [ 50%] 2025-12-04T13:28:26.4935624Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft2_cuda_uint8 PASSED [0.0096s] [ 50%] 2025-12-04T13:28:26.4935733Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_bool PASSED [0.0093s] [ 50%] 2025-12-04T13:28:26.4935845Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_complex64 PASSED [1.4430s] [ 50%] 2025-12-04T13:28:26.4935953Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfft_cuda_int8 PASSED [1.4648s] [ 50%] 2025-12-04T13:28:26.4936069Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex128 PASSED [1.4732s] [ 50%] 2025-12-04T13:28:26.4936183Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_complex32 PASSED [1.4747s] [ 50%] 2025-12-04T13:28:26.4936304Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_float64 PASSED [1.4753s] [ 50%] 2025-12-04T13:28:26.4936411Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_int8 PASSED [1.4923s] [ 50%] 2025-12-04T13:28:26.4936520Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_hfftn_cuda_uint8 PASSED [1.4784s] [ 50%] 2025-12-04T13:28:26.4936639Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft2_cuda_float16 PASSED [1.4720s] [ 50%] 2025-12-04T13:28:26.4936759Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_float32 PASSED [1.4936s] [ 50%] 2025-12-04T13:28:26.4936869Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int16 PASSED [1.5041s] [ 50%] 2025-12-04T13:28:26.4936975Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int32 PASSED [1.4934s] [ 50%] 2025-12-04T13:28:26.4937083Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifft_cuda_int64 PASSED [1.4838s] [ 50%] 2025-12-04T13:28:26.4937197Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_complex64 PASSED [1.5012s] [ 50%] 2025-12-04T13:28:26.4937307Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_float32 PASSED [1.4755s] [ 50%] 2025-12-04T13:28:26.4937416Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftn_cuda_int64 PASSED [1.4836s] [ 50%] 2025-12-04T13:28:26.4937534Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_bfloat16 PASSED [1.4712s] [ 50%] 2025-12-04T13:28:26.4937656Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex128 PASSED [1.4599s] [ 50%] 2025-12-04T13:28:26.4937775Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_complex32 PASSED [1.4590s] [ 50%] 2025-12-04T13:28:26.4937891Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float16 PASSED [1.4658s] [ 50%] 2025-12-04T13:28:26.4938007Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_float32 PASSED [1.4736s] [ 50%] 2025-12-04T13:28:26.4938119Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ifftshift_cuda_int8 PASSED [1.4534s] [ 50%] 2025-12-04T13:28:26.4938231Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft2_cuda_float16 PASSED [1.8006s] [ 50%] 2025-12-04T13:28:26.4938342Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfft_cuda_float32 PASSED [0.0126s] [ 50%] 2025-12-04T13:28:26.4938452Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_bool PASSED [0.0132s] [ 50%] 2025-12-04T13:28:26.4938567Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_float16 PASSED [1.4695s] [ 50%] 2025-12-04T13:28:26.4938675Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int32 PASSED [1.4794s] [ 50%] 2025-12-04T13:28:26.4938796Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_int8 PASSED [1.4704s] [ 50%] 2025-12-04T13:28:26.4938905Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_ihfftn_cuda_uint8 PASSED [1.4749s] [ 50%] 2025-12-04T13:28:26.4939014Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft2_cuda_int16 PASSED [1.4814s] [ 50%] 2025-12-04T13:28:26.4939128Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex128 PASSED [1.4897s] [ 50%] 2025-12-04T13:28:26.4939243Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_complex64 PASSED [1.4587s] [ 50%] 2025-12-04T13:28:26.4939352Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float32 PASSED [1.4602s] [ 50%] 2025-12-04T13:28:26.4939465Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_float64 PASSED [1.4796s] [ 50%] 2025-12-04T13:28:26.4939572Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_irfft_cuda_int32 PASSED [1.4727s] [ 50%] 2025-12-04T13:28:26.4939680Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int16 PASSED [1.4715s] [ 50%] 2025-12-04T13:28:26.4939798Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft2_cuda_int32 PASSED [1.4618s] [ 50%] 2025-12-04T13:28:26.4939908Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_float16 PASSED [1.4694s] [ 50%] 2025-12-04T13:28:26.4940016Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfft_cuda_int32 PASSED [1.4738s] [ 50%] 2025-12-04T13:28:26.4940133Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_int8 PASSED [1.4848s] [ 50%] 2025-12-04T13:28:26.4940253Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fft_rfftn_cuda_uint8 PASSED [1.4837s] [ 50%] 2025-12-04T13:28:26.4940361Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fill_cuda_complex32 PASSED [1.5143s] [ 50%] 2025-12-04T13:28:26.4940471Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flatten_cuda_int16 PASSED [0.0379s] [ 50%] 2025-12-04T13:28:26.4940578Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float16 PASSED [0.0081s] [ 50%] 2025-12-04T13:28:26.4940686Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flip_cuda_float64 PASSED [0.0074s] [ 50%] 2025-12-04T13:28:26.4940798Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_complex128 PASSED [0.0035s] [ 50%] 2025-12-04T13:28:26.4940903Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fliplr_cuda_int64 PASSED [0.0035s] [ 50%] 2025-12-04T13:28:26.4941014Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_complex128 PASSED [0.0031s] [ 50%] 2025-12-04T13:28:26.4941124Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_float64 PASSED [0.0028s] [ 50%] 2025-12-04T13:28:26.4941230Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int32 PASSED [0.0035s] [ 50%] 2025-12-04T13:28:26.4941337Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_flipud_cuda_int64 PASSED [0.0032s] [ 50%] 2025-12-04T13:28:26.4941457Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_complex128 PASSED [0.1342s] [ 50%] 2025-12-04T13:28:26.4941573Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_float32 PASSED [0.1385s] [ 50%] 2025-12-04T13:28:26.4941684Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_float_power_cuda_int8 PASSED [0.1245s] [ 50%] 2025-12-04T13:28:26.4941787Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int16 PASSED [0.0224s] [ 50%] 2025-12-04T13:28:26.4941933Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_cuda_int8 PASSED [0.0220s] [ 50%] 2025-12-04T13:28:26.4942049Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_float16 PASSED [0.5497s] [ 50%] 2025-12-04T13:28:26.4942160Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_floor_divide_cuda_int8 PASSED [0.2378s] [ 50%] 2025-12-04T13:28:26.4942286Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_float32 PASSED [1.5825s] [ 51%] 2025-12-04T13:28:26.4942394Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmin_cuda_uint8 PASSED [1.5663s] [ 51%] 2025-12-04T13:28:26.4942499Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_fmod_cuda_float64 PASSED [0.1116s] [ 51%] 2025-12-04T13:28:26.4942602Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gcd_cuda_int8 PASSED [0.0908s] [ 51%] 2025-12-04T13:28:26.4942705Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_float16 PASSED [0.1083s] [ 51%] 2025-12-04T13:28:26.4942810Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ge_cuda_int16 PASSED [0.1016s] [ 51%] 2025-12-04T13:28:26.4942922Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float16 PASSED [0.0094s] [ 51%] 2025-12-04T13:28:26.4943033Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_float64 PASSED [0.0071s] [ 51%] 2025-12-04T13:28:26.4943141Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int32 XFAIL [0.0025s] [ 51%] 2025-12-04T13:28:26.4943249Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_geometric_cuda_int64 XFAIL [0.0024s] [ 51%] 2025-12-04T13:28:26.4943367Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_bfloat16 PASSED [1.5813s] [ 51%] 2025-12-04T13:28:26.4943474Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float32 PASSED [1.5657s] [ 51%] 2025-12-04T13:28:26.4943578Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_float64 PASSED [0.1068s] [ 51%] 2025-12-04T13:28:26.4943692Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_gt_cuda_int16 PASSED [0.0971s] [ 51%] 2025-12-04T13:28:26.4943822Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_bfloat16 PASSED [0.1795s] [ 51%] 2025-12-04T13:28:26.4943933Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_heaviside_cuda_float32 PASSED [0.1405s] [ 51%] 2025-12-04T13:28:26.4944039Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_bool PASSED [0.0033s] [ 51%] 2025-12-04T13:28:26.4944143Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hsplit_cuda_uint8 PASSED 
[1.4693s] [ 51%] 2025-12-04T13:28:26.4944253Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_float64 PASSED [1.4803s] [ 51%] 2025-12-04T13:28:26.4944356Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_int8 PASSED [1.4826s] [ 51%] 2025-12-04T13:28:26.4944461Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_hstack_cuda_uint8 PASSED [1.4859s] [ 51%] 2025-12-04T13:28:26.4944565Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float32 PASSED [1.4958s] [ 51%] 2025-12-04T13:28:26.4944668Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_float64 PASSED [1.8118s] [ 51%] 2025-12-04T13:28:26.4944770Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int32 PASSED [0.0274s] [ 51%] 2025-12-04T13:28:26.4944871Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_i0_cuda_int64 PASSED [1.4893s] [ 51%] 2025-12-04T13:28:26.4944980Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_imag_cuda_complex64 PASSED [1.5260s] [ 51%] 2025-12-04T13:28:26.4945095Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_bfloat16 PASSED [1.4938s] [ 51%] 2025-12-04T13:28:26.4945205Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_add_cuda_float64 PASSED [1.4856s] [ 51%] 2025-12-04T13:28:26.4945322Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_complex32 PASSED [1.4803s] [ 51%] 2025-12-04T13:28:26.4945434Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_copy_cuda_uint8 PASSED [1.4860s] [ 51%] 2025-12-04T13:28:26.4945544Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_bool PASSED [1.4883s] [ 51%] 2025-12-04T13:28:26.4945662Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_complex128 PASSED [1.4758s] [ 51%] 2025-12-04T13:28:26.4945784Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_fill_cuda_float64 PASSED [1.4904s] [ 51%] 2025-12-04T13:28:26.4945897Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int32 PASSED [1.4862s] [ 51%] 2025-12-04T13:28:26.4946009Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_index_select_cuda_int64 PASSED [1.4987s] [ 51%] 2025-12-04T13:28:26.4946119Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_bool PASSED [1.5288s] [ 51%] 2025-12-04T13:28:26.4946231Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isfinite_cuda_complex64 PASSED [1.5327s] [ 51%] 2025-12-04T13:28:26.4946342Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex128 PASSED [0.0905s] [ 51%] 2025-12-04T13:28:26.4946450Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_complex64 PASSED [0.0835s] [ 51%] 2025-12-04T13:28:26.4946556Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isinf_cuda_int32 PASSED [0.0217s] [ 51%] 2025-12-04T13:28:26.4946666Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_float16 PASSED [1.5162s] [ 51%] 2025-12-04T13:28:26.4946772Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isnan_cuda_int8 PASSED [1.5075s] [ 51%] 2025-12-04T13:28:26.4946890Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isneginf_cuda_int64 PASSED [1.5187s] [ 51%] 2025-12-04T13:28:26.4946998Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int16 PASSED [1.5113s] [ 51%] 2025-12-04T13:28:26.4947115Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int32 PASSED [1.5091s] [ 51%] 2025-12-04T13:28:26.4947223Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_int8 PASSED [1.4993s] [ 51%] 2025-12-04T13:28:26.4947343Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isposinf_cuda_uint8 PASSED [1.5090s] [ 51%] 2025-12-04T13:28:26.4947447Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_bool PASSED [1.5136s] [ 51%] 2025-12-04T13:28:26.4947559Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_complex64 PASSED [1.5286s] [ 51%] 2025-12-04T13:28:26.4947663Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_int8 PASSED [1.5013s] [ 51%] 2025-12-04T13:28:26.4947770Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_isreal_cuda_uint8 PASSED [1.5078s] [ 51%] 2025-12-04T13:28:26.4947872Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int32 XFAIL [0.0053s] [ 51%] 2025-12-04T13:28:26.4947975Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_item_cuda_int64 XFAIL [1.4892s] [ 51%] 2025-12-04T13:28:26.4948077Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lcm_cuda_int64 PASSED [1.6564s] [ 51%] 2025-12-04T13:28:26.4948183Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_float16 PASSED [0.1075s] [ 51%] 2025-12-04T13:28:26.4948285Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_le_cuda_uint8 PASSED [0.0941s] [ 51%] 2025-12-04T13:28:26.4948395Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_bfloat16 PASSED [0.0317s] [ 51%] 2025-12-04T13:28:26.4948502Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float32 PASSED [1.5172s] [ 51%] 2025-12-04T13:28:26.4948612Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_float64 PASSED [1.9279s] [ 51%] 2025-12-04T13:28:26.4948717Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int16 PASSED [0.0261s] [ 51%] 2025-12-04T13:28:26.4948823Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lgamma_cuda_int64 PASSED [0.0230s] [ 51%] 2025-12-04T13:28:26.4948942Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_complex64 PASSED [0.0090s] [ 51%] 2025-12-04T13:28:26.4949054Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_cross_cuda_int8 PASSED [0.0074s] [ 51%] 2025-12-04T13:28:26.4949171Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_bool 
PASSED [0.0083s] [ 51%] 2025-12-04T13:28:26.4949305Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex128 PASSED [1.4957s] [ 51%] 2025-12-04T13:28:26.4949431Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_complex64 PASSED [1.5094s] [ 51%] 2025-12-04T13:28:26.4949549Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int16 PASSED [1.4930s] [ 51%] 2025-12-04T13:28:26.4949667Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int32 PASSED [1.4997s] [ 52%] 2025-12-04T13:28:26.4949783Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_diagonal_cuda_int64 PASSED [1.4957s] [ 52%] 2025-12-04T13:28:26.4949909Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_matrix_norm_cuda_bfloat16 PASSED [1.5356s] [ 52%] 2025-12-04T13:28:26.4950025Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svd_cuda_complex128 PASSED [1.6664s] [ 52%] 2025-12-04T13:28:26.4950142Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_svdvals_cuda_float32 PASSED [0.0372s] [ 52%] 2025-12-04T13:28:26.4950266Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_bfloat16 PASSED [0.1256s] [ 52%] 2025-12-04T13:28:26.4950400Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linalg_vector_norm_cuda_float16 PASSED [0.1037s] [ 52%] 2025-12-04T13:28:26.4950517Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_complex128 PASSED [0.0321s] [ 52%] 2025-12-04T13:28:26.4950628Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_float32 PASSED [0.0307s] [ 52%] 2025-12-04T13:28:26.4950747Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_cuda_int32 PASSED [0.0294s] [ 52%] 2025-12-04T13:28:26.4950888Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_linspace_tensor_overload_cuda_int32 PASSED [0.1227s] [ 52%] 2025-12-04T13:28:26.4950997Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_complex64 PASSED [1.8359s] [ 52%] 2025-12-04T13:28:26.4951104Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_float32 PASSED [1.5293s] [ 52%] 2025-12-04T13:28:26.4951209Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log10_cuda_int32 PASSED [1.5193s] [ 52%] 2025-12-04T13:28:26.4951314Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_bool PASSED [1.5258s] [ 52%] 2025-12-04T13:28:26.4951424Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_complex128 PASSED [1.5486s] [ 52%] 2025-12-04T13:28:26.4951532Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log1p_cuda_float16 PASSED [1.5343s] [ 52%] 2025-12-04T13:28:26.4951637Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log2_cuda_int64 PASSED [1.5221s] [ 52%] 2025-12-04T13:28:26.4951744Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_bfloat16 PASSED [1.5340s] [ 52%] 2025-12-04T13:28:26.4951886Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_cuda_uint8 PASSED [1.5060s] [ 52%] 2025-12-04T13:28:26.4952011Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_log_softmax_with_dtype_cuda_int8 PASSED [1.5074s] [ 52%] 2025-12-04T13:28:26.4952127Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp2_cuda_float32 PASSED [1.5062s] [ 52%] 2025-12-04T13:28:26.4952239Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logaddexp_cuda_float16 PASSED [0.2660s] [ 52%] 2025-12-04T13:28:26.4952349Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_bool PASSED [0.0830s] [ 52%] 2025-12-04T13:28:26.4952460Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_and_cuda_int8 PASSED [0.0969s] [ 52%] 2025-12-04T13:28:26.4952573Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_float64 PASSED [0.0284s] [ 52%] 2025-12-04T13:28:26.4952686Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_not_cuda_int64 PASSED [0.0238s] [ 52%] 2025-12-04T13:28:26.4952802Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_complex64 PASSED [0.1443s] [ 52%] 2025-12-04T13:28:26.4952928Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_float64 PASSED [0.1167s] [ 52%] 2025-12-04T13:28:26.4953039Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_or_cuda_int32 PASSED [0.1148s] [ 52%] 2025-12-04T13:28:26.4953150Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logical_xor_cuda_uint8 PASSED [0.1047s] [ 52%] 2025-12-04T13:28:26.4953260Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_bfloat16 PASSED [0.1298s] [ 52%] 2025-12-04T13:28:26.4953372Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_float64 PASSED [0.0935s] [ 52%] 2025-12-04T13:28:26.4953480Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int32 PASSED [0.1130s] [ 52%] 2025-12-04T13:28:26.4953590Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_int8 PASSED [0.0458s] [ 52%] 2025-12-04T13:28:26.4953697Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_cuda_uint8 PASSED [0.0368s] [ 52%] 2025-12-04T13:28:26.4953835Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_complex64 PASSED [0.5765s] [ 52%] 2025-12-04T13:28:26.4953981Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logspace_tensor_overload_cuda_float16 PASSED [0.5581s] [ 52%] 2025-12-04T13:28:26.4954090Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_bool PASSED [0.0181s] [ 52%] 2025-12-04T13:28:26.4954200Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_logsumexp_cuda_uint8 PASSED [0.0086s] [ 52%] 2025-12-04T13:28:26.4954318Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_bool PASSED [0.0910s] [ 52%] 2025-12-04T13:28:26.4954441Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_lt_cuda_int32 PASSED [0.0948s] [ 52%] 2025-12-04T13:28:26.4954559Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_complex128 PASSED [0.0103s] [ 52%] 2025-12-04T13:28:26.4954674Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_masked_fill_cuda_float32 PASSED [0.0090s] [ 52%] 2025-12-04T13:28:26.4954780Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int16 PASSED [0.0921s] [ 52%] 2025-12-04T13:28:26.4954889Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_maximum_cuda_int32 PASSED [0.0881s] [ 52%] 2025-12-04T13:28:26.4955017Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_bool PASSED [0.0146s] [ 52%] 2025-12-04T13:28:26.4955155Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_complex128 PASSED [1.5397s] [ 52%] 2025-12-04T13:28:26.4955287Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_float16 PASSED [1.5066s] [ 52%] 2025-12-04T13:28:26.4955418Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_int32 PASSED [1.5035s] [ 52%] 2025-12-04T13:28:26.4955547Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_list_of_tensors_cuda_uint8 PASSED [1.5116s] [ 52%] 2025-12-04T13:28:26.4955678Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_bool PASSED [1.5171s] [ 52%] 2025-12-04T13:28:26.4955817Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_complex64 PASSED [1.5091s] [ 52%] 2025-12-04T13:28:26.4955948Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int16 PASSED [1.5243s] [ 52%] 2025-12-04T13:28:26.4956079Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_int32 PASSED [1.4961s] [ 52%] 2025-12-04T13:28:26.4956209Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_meshgrid_variadic_tensors_cuda_uint8 PASSED [1.5343s] [ 52%] 2025-12-04T13:28:26.4956320Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float16 PASSED [0.1105s] [ 52%] 2025-12-04T13:28:26.4956428Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_float64 PASSED [0.0930s] [ 52%] 2025-12-04T13:28:26.4956547Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_int64 PASSED [0.0888s] [ 52%] 2025-12-04T13:28:26.4956655Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_minimum_cuda_uint8 PASSED [0.0863s] [ 52%] 2025-12-04T13:28:26.4956762Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_bool PASSED [0.0067s] [ 52%] 2025-12-04T13:28:26.4956875Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_complex128 PASSED [0.0067s] [ 52%] 2025-12-04T13:28:26.4956984Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_movedim_cuda_int16 PASSED [0.0066s] [ 52%] 2025-12-04T13:28:26.4957096Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_float16 PASSED [0.0699s] [ 52%] 2025-12-04T13:28:26.4957207Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int16 PASSED [0.0216s] [ 52%] 2025-12-04T13:28:26.4957316Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_int64 PASSED [0.0216s] [ 52%] 2025-12-04T13:28:26.4957426Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nan_to_num_cuda_uint8 PASSED [0.0204s] [ 52%] 2025-12-04T13:28:26.4957552Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bfloat16 PASSED [0.0198s] [ 53%] 2025-12-04T13:28:26.4957663Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_bool PASSED [0.0194s] [ 53%] 2025-12-04T13:28:26.4957781Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_complex32 PASSED [1.5311s] [ 53%] 2025-12-04T13:28:26.4957905Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_float64 PASSED [1.5360s] [ 53%] 2025-12-04T13:28:26.4958027Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_copy_cuda_int32 PASSED [1.5376s] [ 53%] 2025-12-04T13:28:26.4958134Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_float16 PASSED [1.5231s] [ 53%] 2025-12-04T13:28:26.4958242Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int16 PASSED [1.5404s] [ 53%] 2025-12-04T13:28:26.4958348Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int32 PASSED [1.5474s] [ 53%] 2025-12-04T13:28:26.4958455Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_narrow_cuda_int64 PASSED [1.5287s] [ 53%] 2025-12-04T13:28:26.4958578Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_native_layer_norm_cuda_float32 PASSED [0.0271s] [ 53%] 2025-12-04T13:28:26.4958679Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_int8 PASSED [0.0946s] [ 53%] 2025-12-04T13:28:26.4958783Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ne_cuda_uint8 PASSED [0.0938s] [ 53%] 2025-12-04T13:28:26.4958892Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_complex128 PASSED [0.0396s] [ 53%] 2025-12-04T13:28:26.4958996Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_float32 PASSED [0.0242s] [ 53%] 2025-12-04T13:28:26.4959100Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_neg_cuda_int64 PASSED [0.0232s] [ 53%] 2025-12-04T13:28:26.4959216Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_complex128 PASSED [0.0055s] [ 53%] 2025-12-04T13:28:26.4959327Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_int16 PASSED [0.0052s] [ 53%] 2025-12-04T13:28:26.4959436Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_cuda_uint8 PASSED [0.0052s] [ 53%] 2025-12-04T13:28:26.4959560Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_empty_strided_cuda_complex64 PASSED [0.0054s] [ 53%] 2025-12-04T13:28:26.4959676Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_complex128 PASSED [0.0057s] [ 53%] 2025-12-04T13:28:26.4959786Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float16 PASSED [0.0054s] [ 53%] 2025-12-04T13:28:26.4959896Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_float32 PASSED [0.0054s] [ 53%] 2025-12-04T13:28:26.4960014Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_full_cuda_int32 PASSED [0.0054s] [ 53%] 2025-12-04T13:28:26.4960125Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_bfloat16 PASSED [0.0055s] [ 53%] 2025-12-04T13:28:26.4960240Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_complex128 PASSED [0.0053s] [ 53%] 2025-12-04T13:28:26.4960349Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_float16 PASSED [0.0053s] [ 53%] 2025-12-04T13:28:26.4960457Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_ones_cuda_int8 PASSED [0.0052s] [ 53%] 2025-12-04T13:28:26.4960572Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_new_zeros_cuda_complex64 PASSED [0.0055s] [ 53%] 2025-12-04T13:28:26.4960686Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_bfloat16 PASSED [0.1058s] [ 53%] 2025-12-04T13:28:26.4960797Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nextafter_cuda_float16 PASSED [0.1034s] [ 53%] 2025-12-04T13:28:26.4960934Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_alpha_dropout_cuda_float32 PASSED [1.5236s] [ 53%] 2025-12-04T13:28:26.4961069Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_celu_cuda_float64 PASSED [1.5698s] [ 53%] 2025-12-04T13:28:26.4961205Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_channel_shuffle_cuda_uint8 PASSED [1.5194s] [ 
53%] 2025-12-04T13:28:26.4961327Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_elu_cuda_float64 PASSED [1.5541s] [ 53%] 2025-12-04T13:28:26.4961464Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_gelu_cuda_bfloat16 PASSED [1.5048s] [ 53%] 2025-12-04T13:28:26.4961598Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_glu_cuda_float32 PASSED [0.0705s] [ 53%] 2025-12-04T13:28:26.4961731Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardshrink_cuda_float32 PASSED [1.5510s] [ 53%] 2025-12-04T13:28:26.4961899Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int64 PASSED [1.5754s] [ 53%] 2025-12-04T13:28:26.4962024Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hardtanh_cuda_int8 PASSED [1.5520s] [ 53%] 2025-12-04T13:28:26.4962171Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_hinge_embedding_loss_cuda_float16 PASSED [1.5600s] [ 53%] 2025-12-04T13:28:26.4962297Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_l1_loss_cuda_float64 PASSED [1.5201s] [ 53%] 2025-12-04T13:28:26.4962445Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_float32 PASSED [1.5158s] [ 53%] 2025-12-04T13:28:26.4962589Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_log_softmax_with_dtype_cuda_int8 PASSED [1.5344s] [ 53%] 2025-12-04T13:28:26.4962729Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_margin_ranking_loss_cuda_int64 PASSED [1.5493s] [ 53%] 2025-12-04T13:28:26.4962861Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_nll_loss_cuda_bfloat16 PASSED [1.6145s] [ 53%] 2025-12-04T13:28:26.4963007Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_complex64 PASSED [1.5315s] [ 53%] 2025-12-04T13:28:26.4963144Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pairwise_distance_cuda_uint8 PASSED [1.5166s] [ 53%] 2025-12-04T13:28:26.4963270Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pdist_cuda_float64 PASSED [1.5540s] [ 53%] 2025-12-04T13:28:26.4963409Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bfloat16 PASSED [0.0151s] [ 53%] 2025-12-04T13:28:26.4963541Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_shuffle_cuda_bool PASSED [0.0124s] [ 53%] 2025-12-04T13:28:26.4963678Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_bool PASSED [0.0100s] [ 53%] 2025-12-04T13:28:26.4963829Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_float16 PASSED [0.0113s] [ 53%] 2025-12-04T13:28:26.4963965Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int16 PASSED [0.0116s] [ 53%] 2025-12-04T13:28:26.4964102Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_pixel_unshuffle_cuda_int8 PASSED [0.0114s] [ 53%] 2025-12-04T13:28:26.4964243Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_bfloat16 PASSED [0.0795s] [ 53%] 2025-12-04T13:28:26.4964382Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float16 PASSED [0.0722s] [ 53%] 2025-12-04T13:28:26.4964522Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_float64 PASSED [1.5659s] [ 53%] 2025-12-04T13:28:26.4964657Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_poisson_nll_loss_cuda_int16 PASSED [1.5968s] [ 53%] 2025-12-04T13:28:26.4964779Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int16 PASSED [1.5692s] [ 53%] 2025-12-04T13:28:26.4964916Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu6_cuda_int32 PASSED [1.5663s] [ 53%] 2025-12-04T13:28:26.4965040Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_bfloat16 PASSED [1.5554s] [ 53%] 2025-12-04T13:28:26.4965178Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_float32 PASSED [1.5395s] [ 53%] 2025-12-04T13:28:26.4965297Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int64 PASSED [1.5633s] [ 53%] 2025-12-04T13:28:26.4965431Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_int8 PASSED [1.5566s] [ 53%] 2025-12-04T13:28:26.4965551Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_relu_cuda_uint8 PASSED [1.5753s] [ 53%] 2025-12-04T13:28:26.4965673Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_selu_cuda_float16 PASSED [1.5752s] [ 53%] 2025-12-04T13:28:26.4965807Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_smooth_l1_loss_cuda_float16 PASSED [1.5299s] [ 53%] 2025-12-04T13:28:26.4965944Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_bool PASSED [1.5517s] [ 54%] 2025-12-04T13:28:26.4966092Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_complex128 PASSED [1.5386s] [ 54%] 2025-12-04T13:28:26.4966232Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softmax_with_dtype_cuda_int16 PASSED [1.5259s] [ 54%] 2025-12-04T13:28:26.4966363Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_softplus_cuda_bfloat16 PASSED [1.5852s] [ 54%] 2025-12-04T13:28:26.4966497Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_bfloat16 PASSED [1.5560s] [ 54%] 2025-12-04T13:28:26.4966629Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_float32 PASSED [1.5530s] 
[ 54%] 2025-12-04T13:28:26.4966756Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_tanhshrink_cuda_int8 PASSED [1.5336s] [ 54%] 2025-12-04T13:28:26.4966885Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_float32 PASSED [1.5404s] [ 54%] 2025-12-04T13:28:26.4967010Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_threshold_cuda_int32 PASSED [1.5706s] [ 54%] 2025-12-04T13:28:26.4967155Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_float16 PASSED [1.5285s] [ 54%] 2025-12-04T13:28:26.4967298Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int16 PASSED [1.5047s] [ 54%] 2025-12-04T13:28:26.4967448Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_nn_functional_triplet_margin_loss_cuda_int8 PASSED [1.5517s] [ 54%] 2025-12-04T13:28:26.4967556Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float32 PASSED [1.5389s] [ 54%] 2025-12-04T13:28:26.4967663Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_norm_cuda_float64 PASSED [1.5433s] [ 54%] 2025-12-04T13:28:26.4967789Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_complex128 PASSED [1.5154s] [ 54%] 2025-12-04T13:28:26.4967909Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal__in_place_cuda_float32 PASSED [1.5257s] [ 54%] 2025-12-04T13:28:26.4968019Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float32 PASSED [1.5472s] [ 54%] 2025-12-04T13:28:26.4968128Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_cuda_float64 PASSED [1.5309s] [ 54%] 2025-12-04T13:28:26.4968254Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_normal_number_mean_cuda_float64 PASSED [1.5315s] [ 54%] 2025-12-04T13:28:26.4968360Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ones_cuda_bool PASSED [1.5306s] [ 54%] 2025-12-04T13:28:26.4968478Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_float32 PASSED [1.5750s] [ 54%] 2025-12-04T13:28:26.4968600Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_copy_cuda_int8 PASSED [1.5619s] [ 54%] 2025-12-04T13:28:26.4968707Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_bool PASSED [1.5878s] [ 54%] 2025-12-04T13:28:26.4968834Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_complex64 PASSED [1.5795s] [ 54%] 2025-12-04T13:28:26.4968944Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_float16 PASSED [1.5672s] [ 54%] 2025-12-04T13:28:26.4969062Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_permute_cuda_int8 PASSED [1.5778s] [ 54%] 2025-12-04T13:28:26.4969173Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_bfloat16 PASSED [1.5559s] [ 54%] 2025-12-04T13:28:26.4969283Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_float16 PASSED [1.5346s] [ 54%] 2025-12-04T13:28:26.4969391Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_int16 PASSED [1.5345s] [ 54%] 2025-12-04T13:28:26.4969499Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_positive_cuda_uint8 PASSED [1.5312s] [ 54%] 2025-12-04T13:28:26.4969606Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_complex128 PASSED [1.6412s] [ 54%] 2025-12-04T13:28:26.4969710Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_pow_cuda_int16 PASSED [0.1023s] [ 54%] 2025-12-04T13:28:26.4969815Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_bool PASSED [0.0263s] [ 54%] 2025-12-04T13:28:26.4969918Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_int8 PASSED [1.5631s] [ 54%] 2025-12-04T13:28:26.4970024Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_prod_cuda_uint8 PASSED [1.5529s] [ 54%] 2025-12-04T13:28:26.4970134Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_float64 PASSED [1.5702s] [ 54%] 2025-12-04T13:28:26.4970243Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rad2deg_cuda_int16 PASSED [1.5540s] [ 54%] 2025-12-04T13:28:26.4970353Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_randn_cuda_complex32 PASSED [1.5191s] [ 54%] 2025-12-04T13:28:26.4970460Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float16 PASSED [1.5263s] [ 54%] 2025-12-04T13:28:26.4970570Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float32 PASSED [1.5554s] [ 54%] 2025-12-04T13:28:26.4970676Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_float64 PASSED [1.5411s] [ 54%] 2025-12-04T13:28:26.4970783Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int16 PASSED [1.5260s] [ 54%] 2025-12-04T13:28:26.4970886Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_ravel_cuda_int64 PASSED [1.5332s] [ 54%] 2025-12-04T13:28:26.4971006Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_complex128 PASSED [1.5714s] [ 54%] 2025-12-04T13:28:26.4971109Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_int32 PASSED [1.5640s] [ 54%] 2025-12-04T13:28:26.4971215Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_real_cuda_uint8 PASSED [1.5480s] [ 54%] 2025-12-04T13:28:26.4971330Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bfloat16 PASSED [1.5681s] [ 54%] 2025-12-04T13:28:26.4971440Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_bool PASSED [1.5896s] [ 54%] 2025-12-04T13:28:26.4971554Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_float32 PASSED [1.5499s] [ 54%] 2025-12-04T13:28:26.4971666Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reciprocal_cuda_int16 PASSED [1.5663s] [ 54%] 2025-12-04T13:28:26.4971778Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_bfloat16 PASSED [0.1242s] [ 54%] 2025-12-04T13:28:26.4971935Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float16 PASSED [1.6665s] [ 54%] 2025-12-04T13:28:26.4972047Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_float32 PASSED [1.6374s] [ 54%] 2025-12-04T13:28:26.4972168Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_remainder_cuda_int16 PASSED [0.1063s] [ 54%] 2025-12-04T13:28:26.4972278Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_renorm_cuda_complex64 PASSED [0.0103s] [ 54%] 2025-12-04T13:28:26.4972397Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_int64 PASSED [0.0442s] [ 54%] 2025-12-04T13:28:26.4972505Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_repeat_cuda_uint8 PASSED [0.0442s] [ 54%] 2025-12-04T13:28:26.4972629Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_as_cuda_int32 PASSED [0.0347s] [ 54%] 2025-12-04T13:28:26.4972741Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_reshape_cuda_complex64 PASSED [1.5662s] [ 54%] 2025-12-04T13:28:26.4972849Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rot90_cuda_float64 PASSED [1.5501s] [ 54%] 2025-12-04T13:28:26.4972955Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_round_cuda_uint8 PASSED [1.5584s] [ 54%] 2025-12-04T13:28:26.4973063Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex128 PASSED [1.6188s] [ 54%] 2025-12-04T13:28:26.4973172Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_complex64 PASSED [1.6439s] [ 54%] 2025-12-04T13:28:26.4973278Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_rsub_cuda_float16 PASSED [1.6117s] [ 54%] 2025-12-04T13:28:26.4973394Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_bool PASSED [1.5382s] [ 54%] 2025-12-04T13:28:26.4973513Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float16 PASSED [1.5525s] [ 54%] 2025-12-04T13:28:26.4973631Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_select_scatter_cuda_float64 PASSED [1.5425s] [ 54%] 2025-12-04T13:28:26.4973736Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_int64 PASSED [1.5702s] [ 55%] 2025-12-04T13:28:26.4973839Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sgn_cuda_uint8 PASSED [1.5865s] [ 55%] 2025-12-04T13:28:26.4973946Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_bool PASSED [1.5868s] [ 55%] 2025-12-04T13:28:26.4974059Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_complex128 PASSED [1.6109s] [ 55%] 2025-12-04T13:28:26.4974169Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sigmoid_cuda_float32 PASSED [0.0435s] [ 55%] 2025-12-04T13:28:26.4974277Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bfloat16 PASSED [1.5839s] [ 55%] 2025-12-04T13:28:26.4974380Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_bool PASSED [1.5580s] [ 55%] 2025-12-04T13:28:26.4974485Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_float32 PASSED [1.5618s] [ 55%] 2025-12-04T13:28:26.4974603Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int16 PASSED [1.5587s] [ 55%] 2025-12-04T13:28:26.4974707Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sign_cuda_int64 PASSED [1.5577s] [ 55%] 2025-12-04T13:28:26.4974816Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_float32 PASSED [1.5518s] [ 55%] 2025-12-04T13:28:26.4974920Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_signbit_cuda_int8 PASSED [1.5600s] [ 55%] 2025-12-04T13:28:26.4975026Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sin_cuda_bfloat16 PASSED [1.5547s] [ 55%] 2025-12-04T13:28:26.4975133Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bfloat16 
PASSED [1.5928s] [ 55%] 2025-12-04T13:28:26.4975239Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_bool PASSED [1.5751s] [ 55%] 2025-12-04T13:28:26.4975343Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinc_cuda_float64 PASSED [1.5862s] [ 55%] 2025-12-04T13:28:26.4975448Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sinh_cuda_float32 PASSED [1.5711s] [ 55%] 2025-12-04T13:28:26.4975583Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float16 PASSED [1.5577s] [ 55%] 2025-12-04T13:28:26.4975706Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_float64 PASSED [1.5667s] [ 55%] 2025-12-04T13:28:26.4975826Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_softmax_with_dtype_cuda_int8 PASSED [1.5499s] [ 55%] 2025-12-04T13:28:26.4975957Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j0_cuda_float32 PASSED [1.5855s] [ 55%] 2025-12-04T13:28:26.4976091Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_int32 PASSED [1.5556s] [ 55%] 2025-12-04T13:28:26.4976209Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_bessel_j1_cuda_uint8 PASSED [1.5540s] [ 55%] 2025-12-04T13:28:26.4976326Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_float64 PASSED [1.6083s] [ 55%] 2025-12-04T13:28:26.4976440Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_entr_cuda_int64 PASSED [1.5929s] [ 55%] 2025-12-04T13:28:26.4976557Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int32 PASSED [1.5812s] [ 55%] 2025-12-04T13:28:26.4976670Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int64 PASSED [1.5743s] [ 55%] 2025-12-04T13:28:26.4976784Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_erfcx_cuda_int8 PASSED [1.5802s] [ 55%] 2025-12-04T13:28:26.4976899Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bfloat16 PASSED [1.5636s] [ 55%] 2025-12-04T13:28:26.4977010Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i0e_cuda_bool PASSED [1.5779s] [ 55%] 2025-12-04T13:28:26.4977123Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_bfloat16 PASSED [1.5801s] [ 55%] 2025-12-04T13:28:26.4977237Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_float32 PASSED [1.5537s] [ 55%] 2025-12-04T13:28:26.4977348Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1_cuda_int64 PASSED [1.5703s] [ 55%] 2025-12-04T13:28:26.4977462Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_bfloat16 PASSED [1.5822s] [ 55%] 2025-12-04T13:28:26.4977572Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_i1e_cuda_int16 PASSED [1.5855s] [ 55%] 2025-12-04T13:28:26.4977712Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_float64 PASSED [1.5436s] [ 55%] 2025-12-04T13:28:26.4977849Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_log_softmax_with_dtype_cuda_int8 PASSED [1.5470s] [ 55%] 2025-12-04T13:28:26.4977963Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int16 PASSED [1.6043s] [ 55%] 2025-12-04T13:28:26.4978087Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_int32 PASSED [1.5724s] [ 55%] 2025-12-04T13:28:26.4978201Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_logit_cuda_uint8 PASSED [1.5731s] [ 55%] 2025-12-04T13:28:26.4978350Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float16 PASSED [1.6125s] [ 55%] 2025-12-04T13:28:26.4978496Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_1_cuda_float32 PASSED [1.5798s] [ 55%] 2025-12-04T13:28:26.4978639Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_3_cuda_int8 PASSED [1.5911s] [ 55%] 2025-12-04T13:28:26.4978784Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_float64 PASSED [0.0808s] [ 55%] 2025-12-04T13:28:26.4978928Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_int32 PASSED [0.0597s] [ 55%] 2025-12-04T13:28:26.4979070Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8 PASSED [0.0419s] [ 55%] 2025-12-04T13:28:26.4979183Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_ndtri_cuda_int8 PASSED [1.5789s] [ 55%] 2025-12-04T13:28:26.4979338Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float32 PASSED [1.5576s] [ 55%] 2025-12-04T13:28:26.4979472Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_float64 PASSED [1.5460s] [ 55%] 2025-12-04T13:28:26.4979615Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int16 PASSED [1.5538s] [ 55%] 2025-12-04T13:28:26.4979757Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_int64 PASSED [1.5656s] [ 55%] 2025-12-04T13:28:26.4979888Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_softmax_with_dtype_cuda_uint8 PASSED [1.5527s] [ 55%] 2025-12-04T13:28:26.4980022Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_spherical_bessel_j0_cuda_float64 PASSED [1.5768s] [ 55%] 2025-12-04T13:28:26.4980139Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_bool PASSED [0.1750s] [ 55%] 2025-12-04T13:28:26.4980257Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int16 PASSED [0.1540s] [ 55%] 2025-12-04T13:28:26.4980375Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_xlog1py_cuda_int32 PASSED 
[0.1360s] [ 55%] 2025-12-04T13:28:26.4980492Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_float64 PASSED [8.7538s] [ 55%] 2025-12-04T13:28:26.4980606Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_special_zeta_cuda_int8 PASSED [0.1244s] [ 55%] 2025-12-04T13:28:26.4980724Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_bool PASSED [1.5494s] [ 55%] 2025-12-04T13:28:26.4980844Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_float64 PASSED [1.5593s] [ 55%] 2025-12-04T13:28:26.4980964Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_split_with_sizes_cuda_int64 PASSED [1.5886s] [ 55%] 2025-12-04T13:28:26.4981073Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_bfloat16 PASSED [1.6033s] [ 55%] 2025-12-04T13:28:26.4981183Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex32 PASSED [1.5920s] [ 55%] 2025-12-04T13:28:26.4981291Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_complex64 PASSED [1.5993s] [ 55%] 2025-12-04T13:28:26.4981399Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_float16 PASSED [1.5928s] [ 55%] 2025-12-04T13:28:26.4981503Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sqrt_cuda_int8 PASSED [1.5755s] [ 55%] 2025-12-04T13:28:26.4981613Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_bfloat16 PASSED [1.5812s] [ 55%] 2025-12-04T13:28:26.4981730Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_float64 PASSED [1.5830s] [ 56%] 2025-12-04T13:28:26.4981836Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_square_cuda_int8 PASSED [1.5802s] [ 56%] 2025-12-04T13:28:26.4981997Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_copy_cuda_int8 PASSED [1.5542s] [ 56%] 2025-12-04T13:28:26.4982107Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_cuda_float32 PASSED [1.5626s] [ 56%] 
2025-12-04T13:28:26.4982234Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_complex128 PASSED [1.5599s] [ 56%] 2025-12-04T13:28:26.4982356Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int16 PASSED [1.5457s] [ 56%] 2025-12-04T13:28:26.4982476Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_squeeze_multiple_cuda_int64 PASSED [1.5472s] [ 56%] 2025-12-04T13:28:26.4982586Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex128 PASSED [0.0137s] [ 56%] 2025-12-04T13:28:26.4982697Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex32 PASSED [0.0111s] [ 56%] 2025-12-04T13:28:26.4982805Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_complex64 PASSED [0.0109s] [ 56%] 2025-12-04T13:28:26.4982937Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float16 PASSED [0.0106s] [ 56%] 2025-12-04T13:28:26.4983045Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_float32 PASSED [0.0105s] [ 56%] 2025-12-04T13:28:26.4983164Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_int16 PASSED [0.0108s] [ 56%] 2025-12-04T13:28:26.4983268Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_stack_cuda_uint8 PASSED [0.0104s] [ 56%] 2025-12-04T13:28:26.4983390Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_complex128 PASSED [0.0109s] [ 56%] 2025-12-04T13:28:26.4983494Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_std_cuda_float16 PASSED [1.5653s] [ 56%] 2025-12-04T13:28:26.4983602Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_bfloat16 PASSED [0.1187s] [ 56%] 2025-12-04T13:28:26.4983709Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_complex32 PASSED [0.1665s] [ 56%] 2025-12-04T13:28:26.4983814Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int64 PASSED [0.1020s] [ 56%] 2025-12-04T13:28:26.4983916Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_int8 PASSED [0.0997s] [ 56%] 2025-12-04T13:28:26.4984019Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sub_cuda_uint8 PASSED [0.0995s] [ 56%] 2025-12-04T13:28:26.4984121Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_cuda_uint8 PASSED [0.0155s] [ 56%] 2025-12-04T13:28:26.4984237Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_bfloat16 PASSED [1.5519s] [ 56%] 2025-12-04T13:28:26.4984348Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_sum_to_size_cuda_int16 PASSED [1.5441s] [ 56%] 2025-12-04T13:28:26.4984459Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_complex128 PASSED [1.5764s] [ 56%] 2025-12-04T13:28:26.4984567Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_float32 PASSED [1.5528s] [ 56%] 2025-12-04T13:28:26.4984673Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_copy_cuda_uint8 PASSED [1.5550s] [ 56%] 2025-12-04T13:28:26.4984779Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_t_cuda_bfloat16 PASSED [1.5567s] [ 56%] 2025-12-04T13:28:26.4984894Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int16 PASSED [1.5642s] [ 56%] 2025-12-04T13:28:26.4985009Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_take_along_dim_cuda_int8 PASSED [1.5634s] [ 56%] 2025-12-04T13:28:26.4985118Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_complex64 PASSED [1.6025s] [ 56%] 2025-12-04T13:28:26.4985220Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int32 PASSED [1.5874s] [ 56%] 2025-12-04T13:28:26.4985334Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tan_cuda_int64 PASSED [1.5803s] [ 56%] 2025-12-04T13:28:26.4985440Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_float32 PASSED [1.5910s] [ 56%] 2025-12-04T13:28:26.4985544Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tanh_cuda_int32 
PASSED [1.5872s] [ 56%] 2025-12-04T13:28:26.4985665Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_complex128 PASSED [1.5710s] [ 56%] 2025-12-04T13:28:26.4985781Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_float16 PASSED [1.5729s] [ 56%] 2025-12-04T13:28:26.4985895Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int16 PASSED [1.5686s] [ 56%] 2025-12-04T13:28:26.4986009Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tensor_split_cuda_int64 PASSED [1.5669s] [ 56%] 2025-12-04T13:28:26.4986115Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_complex64 PASSED [1.5703s] [ 56%] 2025-12-04T13:28:26.4986218Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_to_cuda_int64 PASSED [1.5703s] [ 56%] 2025-12-04T13:28:26.4986325Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_bfloat16 PASSED [1.5606s] [ 56%] 2025-12-04T13:28:26.4986445Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trace_cuda_complex32 PASSED [1.5530s] [ 56%] 2025-12-04T13:28:26.4986567Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_complex64 PASSED [1.1715s] [ 56%] 2025-12-04T13:28:26.4986699Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_float16 PASSED [1.1530s] [ 56%] 2025-12-04T13:28:26.4986816Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int64 PASSED [1.1472s] [ 56%] 2025-12-04T13:28:26.4986943Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_copy_cuda_int8 PASSED [1.1314s] [ 56%] 2025-12-04T13:28:26.4987059Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex128 PASSED [1.1341s] [ 56%] 2025-12-04T13:28:26.4987177Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex32 PASSED [1.1308s] [ 56%] 2025-12-04T13:28:26.4987292Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_transpose_cuda_complex64 
PASSED [1.1341s] [ 56%] 2025-12-04T13:28:26.4987401Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_complex64 PASSED [1.1264s] [ 56%] 2025-12-04T13:28:26.4987510Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_float16 PASSED [1.1154s] [ 56%] 2025-12-04T13:28:26.4987616Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_tril_cuda_uint8 PASSED [1.1296s] [ 56%] 2025-12-04T13:28:26.4987726Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_complex32 PASSED [1.1453s] [ 56%] 2025-12-04T13:28:26.4987829Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int16 PASSED [1.1382s] [ 56%] 2025-12-04T13:28:26.4987933Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int32 PASSED [1.1172s] [ 56%] 2025-12-04T13:28:26.4988036Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_int64 PASSED [1.1225s] [ 56%] 2025-12-04T13:28:26.4988140Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_triu_cuda_uint8 PASSED [1.1311s] [ 56%] 2025-12-04T13:28:26.4988255Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bfloat16 PASSED [1.2487s] [ 56%] 2025-12-04T13:28:26.4988365Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_true_divide_cuda_bool PASSED [1.2261s] [ 56%] 2025-12-04T13:28:26.4988473Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_float64 PASSED [1.1282s] [ 56%] 2025-12-04T13:28:26.4988579Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int16 PASSED [1.1249s] [ 56%] 2025-12-04T13:28:26.4988683Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_int64 PASSED [1.1462s] [ 56%] 2025-12-04T13:28:26.4988789Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_trunc_cuda_uint8 PASSED [1.1404s] [ 56%] 2025-12-04T13:28:26.4988915Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bfloat16 PASSED [1.1246s] [ 56%] 2025-12-04T13:28:26.4989027Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_bool PASSED [1.1382s] [ 56%] 2025-12-04T13:28:26.4989144Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_complex64 PASSED [1.1296s] [ 56%] 2025-12-04T13:28:26.4989256Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_int16 PASSED [1.1330s] [ 57%] 2025-12-04T13:28:26.4989367Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_copy_cuda_uint8 PASSED [1.1266s] [ 57%] 2025-12-04T13:28:26.4989473Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_bool PASSED [1.1425s] [ 57%] 2025-12-04T13:28:26.4989583Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_complex32 PASSED [1.1453s] [ 57%] 2025-12-04T13:28:26.4989687Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int32 PASSED [1.1496s] [ 57%] 2025-12-04T13:28:26.4989793Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unbind_cuda_int64 PASSED [1.1389s] [ 57%] 2025-12-04T13:28:26.4989910Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_bool PASSED [0.0113s] [ 57%] 2025-12-04T13:28:26.4990021Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float16 PASSED [0.0095s] [ 57%] 2025-12-04T13:28:26.4990132Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_float64 PASSED [0.0093s] [ 57%] 2025-12-04T13:28:26.4990252Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unflatten_cuda_int8 PASSED [0.0092s] [ 57%] 2025-12-04T13:28:26.4990366Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_bool PASSED [1.1286s] [ 57%] 2025-12-04T13:28:26.4990473Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unfold_cuda_float16 PASSED [1.1424s] [ 57%] 2025-12-04T13:28:26.4990595Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_copy_cuda_complex32 PASSED [1.1095s] [ 57%] 2025-12-04T13:28:26.4990707Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_float64 PASSED [1.1138s] [ 57%] 2025-12-04T13:28:26.4990817Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_unsqueeze_cuda_uint8 PASSED [1.1109s] [ 57%] 2025-12-04T13:28:26.4990923Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_var_cuda_bfloat16 PASSED [1.1643s] [ 57%] 2025-12-04T13:28:26.4991031Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex128 PASSED [1.1326s] [ 57%] 2025-12-04T13:28:26.4991140Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vdot_cuda_complex64 PASSED [1.1216s] [ 57%] 2025-12-04T13:28:26.4991260Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_as_complex_cuda_float16 PASSED [1.1371s] [ 57%] 2025-12-04T13:28:26.4991372Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_bfloat16 PASSED [1.1187s] [ 57%] 2025-12-04T13:28:26.4991481Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_copy_cuda_uint8 PASSED [1.1533s] [ 57%] 2025-12-04T13:28:26.4991586Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_view_cuda_uint8 PASSED [1.1646s] [ 57%] 2025-12-04T13:28:26.4991695Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bfloat16 PASSED [1.1215s] [ 57%] 2025-12-04T13:28:26.4991799Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_bool PASSED [1.1351s] [ 57%] 2025-12-04T13:28:26.4991953Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_complex64 PASSED [1.1280s] [ 57%] 2025-12-04T13:28:26.4992061Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_float32 PASSED [1.1372s] [ 57%] 2025-12-04T13:28:26.4992166Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vsplit_cuda_int8 PASSED [1.1348s] [ 57%] 2025-12-04T13:28:26.4992273Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float16 PASSED [1.1491s] [ 57%] 2025-12-04T13:28:26.4992394Z 
test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_float32 PASSED [1.1315s] [ 57%] 2025-12-04T13:28:26.4992499Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_vstack_cuda_int32 PASSED [1.1311s] [ 57%] 2025-12-04T13:28:26.4992605Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int16 PASSED [0.1468s] [ 57%] 2025-12-04T13:28:26.4992708Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_xlogy_cuda_int32 PASSED [0.1376s] [ 57%] 2025-12-04T13:28:26.4992816Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float16 PASSED [1.1322s] [ 57%] 2025-12-04T13:28:26.4992924Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_float32 PASSED [1.1347s] [ 57%] 2025-12-04T13:28:26.4993029Z test_ops.py::TestCommonCUDA::test_python_ref_meta__refs_zeros_cuda_int64 PASSED [1.1365s] [ 57%] 2025-12-04T13:28:26.4993141Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_bool PASSED [0.0038s] [ 57%] 2025-12-04T13:28:26.4993261Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex32 PASSED [1.1473s] [ 57%] 2025-12-04T13:28:26.4993379Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_complex64 PASSED [0.0043s] [ 57%] 2025-12-04T13:28:26.4993504Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_T_cuda_int64 PASSED [1.1486s] [ 57%] 2025-12-04T13:28:26.4993645Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_bool PASSED [0.0172s] [ 57%] 2025-12-04T13:28:26.4993801Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bfloat16_cuda_float32 PASSED [1.1531s] [ 57%] 2025-12-04T13:28:26.4993949Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_bool PASSED [0.0133s] [ 57%] 2025-12-04T13:28:26.4994084Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_float16 PASSED [1.1511s] [ 57%] 
2025-12-04T13:28:26.4994219Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_bool_cuda_int64 PASSED [0.0120s] [ 57%] 2025-12-04T13:28:26.4994359Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_byte_cuda_complex64 PASSED [1.1573s] [ 57%] 2025-12-04T13:28:26.4994505Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_bfloat16 PASSED [0.0177s] [ 57%] 2025-12-04T13:28:26.4994647Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_cdouble_cuda_float64 PASSED [1.1478s] [ 57%] 2025-12-04T13:28:26.4994785Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_float64 PASSED [0.0182s] [ 57%] 2025-12-04T13:28:26.4994921Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_chalf_cuda_int8 PASSED [1.1443s] [ 57%] 2025-12-04T13:28:26.4995061Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_complex32 PASSED [0.0262s] [ 57%] 2025-12-04T13:28:26.4995196Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_char_cuda_int16 PASSED [1.1354s] [ 57%] 2025-12-04T13:28:26.4995337Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_complex_cuda_float64 PASSED [0.0432s] [ 57%] 2025-12-04T13:28:26.4995482Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_complex64 PASSED [0.0277s] [ 57%] 2025-12-04T13:28:26.4995620Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_float32 PASSED [1.1445s] [ 57%] 2025-12-04T13:28:26.4995757Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_int32 PASSED [0.0155s] [ 57%] 2025-12-04T13:28:26.4995893Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_double_cuda_uint8 PASSED [1.1525s] [ 57%] 2025-12-04T13:28:26.4996035Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_complex128 PASSED [0.0308s] [ 57%] 2025-12-04T13:28:26.4996187Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float32 PASSED [1.1507s] [ 57%] 2025-12-04T13:28:26.4996326Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_float_cuda_float64 PASSED [0.0170s] [ 57%] 2025-12-04T13:28:26.4996462Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_bfloat16 PASSED [1.1386s] [ 57%] 2025-12-04T13:28:26.4996598Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_float64 PASSED [0.0168s] [ 57%] 2025-12-04T13:28:26.4996731Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int16 PASSED [1.1493s] [ 57%] 2025-12-04T13:28:26.4996865Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int32 PASSED [0.0153s] [ 57%] 2025-12-04T13:28:26.4996998Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_int8 PASSED [1.1317s] [ 57%] 2025-12-04T13:28:26.4997133Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_half_cuda_uint8 PASSED [0.0145s] [ 57%] 2025-12-04T13:28:26.4997281Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_bfloat16 PASSED [1.1410s] [ 57%] 2025-12-04T13:28:26.4997416Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float32 PASSED [0.0132s] [ 57%] 2025-12-04T13:28:26.4997561Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_float64 PASSED [1.1468s] [ 58%] 2025-12-04T13:28:26.4997694Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_int16 PASSED [0.0119s] [ 58%] 2025-12-04T13:28:26.4997836Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_int_cuda_uint8 PASSED [1.1410s] [ 58%] 2025-12-04T13:28:26.4997972Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float32 PASSED [0.0133s] [ 58%] 2025-12-04T13:28:26.4998107Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_float64 PASSED [1.1352s] [ 58%] 2025-12-04T13:28:26.4998242Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int16 PASSED [0.0120s] [ 58%] 2025-12-04T13:28:26.4998374Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_long_cuda_int64 PASSED [1.1376s] [ 58%] 2025-12-04T13:28:26.4998513Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_polar_cuda_float32 PASSED [0.0456s] [ 58%] 2025-12-04T13:28:26.4998651Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_bfloat16 PASSED [0.0116s] [ 58%] 2025-12-04T13:28:26.4998794Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_complex64 PASSED [1.1517s] [ 58%] 2025-12-04T13:28:26.4998930Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs__conversions_short_cuda_int16 PASSED [0.0120s] [ 58%] 2025-12-04T13:28:26.4999054Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_complex64 PASSED [0.0300s] [ 58%] 2025-12-04T13:28:26.4999171Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_int8 PASSED [1.1425s] [ 58%] 2025-12-04T13:28:26.4999288Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_abs_cuda_uint8 PASSED [0.0123s] [ 58%] 2025-12-04T13:28:26.4999412Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_complex32 PASSED [0.0312s] [ 58%] 2025-12-04T13:28:26.4999532Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acos_cuda_float64 PASSED [0.0161s] [ 58%] 
2025-12-04T13:28:26.4999651Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int16 PASSED [1.1397s] [ 58%] 2025-12-04T13:28:26.4999768Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_acosh_cuda_int32 PASSED [0.0172s] [ 58%] 2025-12-04T13:28:26.4999900Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int16 PASSED [0.0205s] [ 58%] 2025-12-04T13:28:26.5000022Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int32 PASSED [1.1457s] [ 58%] 2025-12-04T13:28:26.5000142Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int64 PASSED [0.0225s] [ 58%] 2025-12-04T13:28:26.5000262Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_int8 PASSED [0.0208s] [ 58%] 2025-12-04T13:28:26.5000382Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addcmul_cuda_uint8 PASSED [1.1626s] [ 58%] 2025-12-04T13:28:26.5000501Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float16 PASSED [0.0082s] [ 58%] 2025-12-04T13:28:26.5000621Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_float64 PASSED [1.1510s] [ 58%] 2025-12-04T13:28:26.5000739Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_addr_cuda_int16 PASSED [0.0048s] [ 58%] 2025-12-04T13:28:26.5000870Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_bfloat16 PASSED [1.1391s] [ 58%] 2025-12-04T13:28:26.5001012Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_complex32 PASSED [0.0042s] [ 58%] 2025-12-04T13:28:26.5001141Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_float32 PASSED [1.1318s] [ 58%] 2025-12-04T13:28:26.5001275Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_alias_copy_cuda_uint8 PASSED [0.0039s] [ 58%] 2025-12-04T13:28:26.5001394Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_all_cuda_float32 PASSED [1.1437s] [ 58%] 2025-12-04T13:28:26.5001530Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_allclose_cuda_bfloat16 PASSED [0.0116s] [ 58%] 2025-12-04T13:28:26.5001648Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_int8 PASSED [0.0077s] [ 58%] 2025-12-04T13:28:26.5001766Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amax_cuda_uint8 PASSED [1.1436s] [ 58%] 2025-12-04T13:28:26.5001923Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int16 PASSED [0.0086s] [ 58%] 2025-12-04T13:28:26.5002039Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int32 PASSED [0.0073s] [ 58%] 2025-12-04T13:28:26.5002154Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_amin_cuda_int8 PASSED [0.0071s] [ 58%] 2025-12-04T13:28:26.5002273Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_float16 PASSED [1.1298s] [ 58%] 2025-12-04T13:28:26.5002387Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_any_cuda_int8 PASSED [0.0075s] [ 58%] 2025-12-04T13:28:26.5002512Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_bfloat16 PASSED [0.0108s] [ 58%] 2025-12-04T13:28:26.5002632Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_arange_cuda_int64 PASSED [0.0077s] [ 58%] 2025-12-04T13:28:26.5002767Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_float64 PASSED [0.0039s] [ 58%] 2025-12-04T13:28:26.5002899Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_int64 PASSED [1.1161s] [ 58%] 2025-12-04T13:28:26.5003031Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_copy_cuda_uint8 PASSED [0.0046s] [ 58%] 2025-12-04T13:28:26.5003160Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_bfloat16 PASSED [1.1387s] [ 58%] 2025-12-04T13:28:26.5003295Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_complex128 PASSED [0.0062s] [ 58%] 2025-12-04T13:28:26.5003425Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_float64 PASSED [0.0043s] [ 58%] 2025-12-04T13:28:26.5003565Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int32 PASSED [1.1274s] [ 58%] 2025-12-04T13:28:26.5003691Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_int8 PASSED [0.0048s] [ 58%] 2025-12-04T13:28:26.5003815Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_cuda_uint8 PASSED [0.0036s] [ 58%] 2025-12-04T13:28:26.5003963Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float32 PASSED [1.1388s] [ 58%] 2025-12-04T13:28:26.5004109Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_partial_views_cuda_float64 PASSED [0.0049s] [ 58%] 2025-12-04T13:28:26.5004250Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_as_strided_scatter_cuda_complex32 PASSED [1.1446s] [ 58%] 2025-12-04T13:28:26.5004370Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_bfloat16 PASSED [0.0178s] [ 58%] 2025-12-04T13:28:26.5004495Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_complex128 PASSED [0.0297s] [ 58%] 2025-12-04T13:28:26.5004615Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_float64 PASSED [0.0148s] [ 58%] 2025-12-04T13:28:26.5004746Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asin_cuda_int8 PASSED [1.1539s] [ 58%] 2025-12-04T13:28:26.5004864Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_bool PASSED [0.0183s] [ 58%] 
2025-12-04T13:28:26.5005005Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_complex128 PASSED [0.0297s] [ 58%] 2025-12-04T13:28:26.5005129Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_float32 PASSED [0.0149s] [ 58%] 2025-12-04T13:28:26.5005262Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_asinh_cuda_int16 PASSED [1.1571s] [ 58%] 2025-12-04T13:28:26.5005383Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan2_cuda_float64 PASSED [0.0481s] [ 58%] 2025-12-04T13:28:26.5005500Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int16 PASSED [0.0137s] [ 58%] 2025-12-04T13:28:26.5005616Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atan_cuda_int8 PASSED [0.0124s] [ 58%] 2025-12-04T13:28:26.5005735Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float32 PASSED [1.1410s] [ 58%] 2025-12-04T13:28:26.5005854Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_float64 PASSED [0.0176s] [ 58%] 2025-12-04T13:28:26.5005972Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atanh_cuda_int16 PASSED [0.0138s] [ 58%] 2025-12-04T13:28:26.5006102Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_1d_cuda_float16 PASSED [1.1325s] [ 59%] 2025-12-04T13:28:26.5006233Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_complex64 PASSED [0.0069s] [ 59%] 2025-12-04T13:28:26.5006362Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_2d_cuda_float16 PASSED [0.0052s] [ 59%] 2025-12-04T13:28:26.5006488Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float16 PASSED [0.0053s] [ 59%] 2025-12-04T13:28:26.5006615Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_atleast_3d_cuda_float32 PASSED [0.0052s] [ 59%] 2025-12-04T13:28:26.5006740Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_and_cuda_int16 PASSED [0.0368s] [ 59%] 2025-12-04T13:28:26.5006878Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_left_shift_cuda_uint8 PASSED [0.0358s] [ 59%] 2025-12-04T13:28:26.5007003Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_bool PASSED [0.0135s] [ 59%] 2025-12-04T13:28:26.5007128Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int64 PASSED [1.1513s] [ 59%] 2025-12-04T13:28:26.5007261Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_not_cuda_int8 PASSED [0.0128s] [ 59%] 2025-12-04T13:28:26.5007398Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_right_shift_cuda_int64 PASSED [0.0377s] [ 59%] 2025-12-04T13:28:26.5007524Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_bool PASSED [0.0362s] [ 59%] 2025-12-04T13:28:26.5007648Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bitwise_xor_cuda_int32 PASSED [0.0373s] [ 59%] 2025-12-04T13:28:26.5007774Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_bool PASSED [1.1312s] [ 59%] 2025-12-04T13:28:26.5007899Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_int8 PASSED [0.0047s] [ 59%] 2025-12-04T13:28:26.5008023Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_block_diag_cuda_uint8 PASSED [1.1341s] [ 59%] 2025-12-04T13:28:26.5008159Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_float32 PASSED [0.0078s] [ 59%] 2025-12-04T13:28:26.5008294Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int16 PASSED [1.1185s] [ 59%] 2025-12-04T13:28:26.5008438Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_broadcast_tensors_cuda_int32 PASSED [0.0057s] [ 59%] 
2025-12-04T13:28:26.5008564Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_float64 PASSED [1.1580s] [ 59%] 2025-12-04T13:28:26.5008697Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_bucketize_cuda_int8 PASSED [0.0290s] [ 59%] 2025-12-04T13:28:26.5008817Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bfloat16 PASSED [0.0074s] [ 59%] 2025-12-04T13:28:26.5008951Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_bool PASSED [0.0058s] [ 59%] 2025-12-04T13:28:26.5009071Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_complex32 PASSED [0.0072s] [ 59%] 2025-12-04T13:28:26.5009191Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_float32 PASSED [0.0066s] [ 59%] 2025-12-04T13:28:26.5009306Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_int16 PASSED [0.0057s] [ 59%] 2025-12-04T13:28:26.5009422Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cat_cuda_uint8 PASSED [0.0058s] [ 59%] 2025-12-04T13:28:26.5009538Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_int32 PASSED [1.1369s] [ 59%] 2025-12-04T13:28:26.5009655Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ceil_cuda_uint8 PASSED [0.0119s] [ 59%] 2025-12-04T13:28:26.5009770Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_bool PASSED [0.0091s] [ 59%] 2025-12-04T13:28:26.5009889Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_chunk_cuda_int32 PASSED [1.1554s] [ 59%] 2025-12-04T13:28:26.5010011Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_cuda_bfloat16 PASSED [0.0152s] [ 59%] 2025-12-04T13:28:26.5010139Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_bfloat16 PASSED [0.0372s] [ 59%] 2025-12-04T13:28:26.5010262Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int32 PASSED [0.0253s] [ 59%] 2025-12-04T13:28:26.5010384Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clamp_min_cuda_int8 PASSED [1.1492s] [ 59%] 2025-12-04T13:28:26.5010509Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_complex32 PASSED [0.0294s] [ 59%] 2025-12-04T13:28:26.5010628Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_float16 PASSED [0.0260s] [ 59%] 2025-12-04T13:28:26.5010748Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int16 PASSED [0.0193s] [ 59%] 2025-12-04T13:28:26.5010864Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int64 PASSED [0.0194s] [ 59%] 2025-12-04T13:28:26.5010993Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_clone_cuda_int8 PASSED [0.0192s] [ 59%] 2025-12-04T13:28:26.5011127Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_complex64 PASSED [1.1319s] [ 59%] 2025-12-04T13:28:26.5011259Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float32 PASSED [0.0054s] [ 59%] 2025-12-04T13:28:26.5011389Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_column_stack_cuda_float64 PASSED [1.1375s] [ 59%] 2025-12-04T13:28:26.5011511Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bfloat16 PASSED [0.0154s] [ 59%] 2025-12-04T13:28:26.5011629Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_bool PASSED [1.1309s] [ 59%] 2025-12-04T13:28:26.5011747Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_cuda_uint8 PASSED [0.0107s] [ 59%] 2025-12-04T13:28:26.5011918Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_conj_physical_cuda_float16 PASSED [1.1437s] [ 59%] 2025-12-04T13:28:26.5012065Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_float16 PASSED [0.0221s] [ 59%] 2025-12-04T13:28:26.5012197Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int16 PASSED [0.0161s] [ 59%] 2025-12-04T13:28:26.5012327Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_constant_pad_nd_cuda_int64 PASSED [0.0158s] [ 59%] 2025-12-04T13:28:26.5012466Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_bool PASSED [1.1465s] [ 59%] 2025-12-04T13:28:26.5012609Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex32 PASSED [0.0247s] [ 59%] 2025-12-04T13:28:26.5012741Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_complex64 PASSED [1.1612s] [ 59%] 2025-12-04T13:28:26.5012866Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int32 PASSED [0.0174s] [ 59%] 2025-12-04T13:28:26.5012992Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_int8 PASSED [1.1602s] [ 59%] 2025-12-04T13:28:26.5013116Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_contiguous_cuda_uint8 PASSED [0.0173s] [ 59%] 2025-12-04T13:28:26.5013238Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_copysign_cuda_uint8 PASSED [0.0709s] [ 59%] 2025-12-04T13:28:26.5013354Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_bool PASSED [0.0172s] [ 59%] 2025-12-04T13:28:26.5013475Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_complex32 PASSED [0.4301s] [ 59%] 2025-12-04T13:28:26.5013593Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float32 PASSED [0.0154s] [ 59%] 2025-12-04T13:28:26.5013713Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cos_cuda_float64 PASSED [1.1561s] [ 59%] 2025-12-04T13:28:26.5013837Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_complex32 PASSED [0.0331s] [ 59%] 2025-12-04T13:28:26.5013958Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_float32 PASSED [0.0163s] [ 59%] 2025-12-04T13:28:26.5014075Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int16 PASSED [0.0147s] [ 59%] 2025-12-04T13:28:26.5014192Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int32 PASSED [1.1609s] [ 59%] 2025-12-04T13:28:26.5014307Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cosh_cuda_int8 PASSED [0.0155s] [ 59%] 2025-12-04T13:28:26.5014440Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_bfloat16 PASSED [1.1700s] [ 59%] 2025-12-04T13:28:26.5014569Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_int64 PASSED [0.0076s] [ 60%] 2025-12-04T13:28:26.5014710Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_count_nonzero_cuda_uint8 PASSED [1.6764s] [ 60%] 2025-12-04T13:28:26.5014833Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_int64 PASSED [0.0078s] [ 60%] 2025-12-04T13:28:26.5014953Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumprod_cuda_uint8 PASSED [1.6361s] [ 60%] 2025-12-04T13:28:26.5015074Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_cumsum_cuda_float64 PASSED [0.0062s] [ 60%] 2025-12-04T13:28:26.5015199Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_bfloat16 PASSED [1.5723s] [ 60%] 2025-12-04T13:28:26.5015322Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_deg2rad_cuda_int32 PASSED [0.0148s] [ 60%] 2025-12-04T13:28:26.5015439Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_cuda_int64 PASSED [1.5850s] [ 60%] 2025-12-04T13:28:26.5015570Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_complex32 PASSED [0.0142s] [ 60%] 2025-12-04T13:28:26.5015698Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float32 PASSED [1.5904s] [ 60%] 2025-12-04T13:28:26.5015834Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diag_embed_cuda_float64 PASSED [0.0131s] [ 60%] 2025-12-04T13:28:26.5015968Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_complex32 PASSED [1.5755s] [ 60%] 2025-12-04T13:28:26.5016110Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_float16 PASSED [0.0111s] [ 60%] 2025-12-04T13:28:26.5016249Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int16 PASSED [0.0075s] [ 60%] 2025-12-04T13:28:26.5016379Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_copy_cuda_int64 PASSED [1.5684s] [ 60%] 2025-12-04T13:28:26.5016507Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_complex64 PASSED [0.0110s] [ 60%] 2025-12-04T13:28:26.5016631Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_float16 PASSED [1.5795s] [ 60%] 2025-12-04T13:28:26.5016756Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_cuda_int16 PASSED [0.0087s] [ 60%] 2025-12-04T13:28:26.5016892Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float16 PASSED [0.0081s] [ 60%] 2025-12-04T13:28:26.5017031Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_float64 PASSED [0.0076s] [ 60%] 2025-12-04T13:28:26.5017163Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int16 PASSED [0.0061s] [ 60%] 2025-12-04T13:28:26.5017295Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_diagonal_scatter_cuda_int64 PASSED [0.0061s] [ 60%] 
2025-12-04T13:28:26.5017431Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_int16 PASSED [0.0499s] [ 60%] 2025-12-04T13:28:26.5017564Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_floor_rounding_cuda_uint8 PASSED [1.5945s] [ 60%] 2025-12-04T13:28:26.5017701Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_no_rounding_mode_cuda_uint8 PASSED [0.0541s] [ 60%] 2025-12-04T13:28:26.5017834Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_div_trunc_rounding_cuda_int16 PASSED [0.0409s] [ 60%] 2025-12-04T13:28:26.5017955Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_bool PASSED [0.0028s] [ 60%] 2025-12-04T13:28:26.5018079Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_float16 PASSED [1.5703s] [ 60%] 2025-12-04T13:28:26.5018198Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_int16 PASSED [0.0044s] [ 60%] 2025-12-04T13:28:26.5018327Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dsplit_cuda_uint8 PASSED [1.5836s] [ 60%] 2025-12-04T13:28:26.5018453Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_dstack_cuda_complex64 PASSED [0.0061s] [ 60%] 2025-12-04T13:28:26.5018631Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_complex128 SKIPPED [0.0002s] (Expected: empty is not comparable) [ 60%] 2025-12-04T13:28:26.5018801Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_cuda_float32 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 60%] 2025-12-04T13:28:26.5018988Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_empty_strided_cuda_int32 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 60%] 2025-12-04T13:28:26.5019109Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_complex64 PASSED [0.0525s] [ 60%] 2025-12-04T13:28:26.5019228Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eq_cuda_float16 PASSED [0.0399s] [ 60%] 2025-12-04T13:28:26.5019345Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_bool PASSED [1.5742s] [ 60%] 2025-12-04T13:28:26.5019480Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_complex128 PASSED [0.0063s] [ 60%] 2025-12-04T13:28:26.5019601Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_float64 PASSED [1.5793s] [ 60%] 2025-12-04T13:28:26.5019720Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_equal_cuda_int64 PASSED [0.0059s] [ 60%] 2025-12-04T13:28:26.5019844Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erf_cuda_int32 PASSED [0.0139s] [ 60%] 2025-12-04T13:28:26.5019973Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfc_cuda_int8 PASSED [1.6308s] [ 60%] 2025-12-04T13:28:26.5020091Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_bool PASSED [0.0186s] [ 60%] 2025-12-04T13:28:26.5020216Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float16 PASSED [0.2657s] [ 60%] 2025-12-04T13:28:26.5020337Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_float64 PASSED [0.1987s] [ 60%] 2025-12-04T13:28:26.5020458Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int64 PASSED [1.5917s] [ 60%] 2025-12-04T13:28:26.5020575Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_erfinv_cuda_int8 PASSED [0.0169s] [ 60%] 2025-12-04T13:28:26.5020694Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exp2_cuda_float64 PASSED [0.2791s] [ 60%] 2025-12-04T13:28:26.5020822Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_float16 PASSED [1.5666s] [ 60%] 2025-12-04T13:28:26.5020946Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int64 PASSED [0.0044s] [ 60%] 2025-12-04T13:28:26.5021068Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_as_cuda_int8 PASSED [1.5736s] [ 60%] 2025-12-04T13:28:26.5021200Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_complex64 PASSED [0.0072s] [ 60%] 2025-12-04T13:28:26.5021326Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_copy_cuda_int16 PASSED [1.5709s] [ 60%] 2025-12-04T13:28:26.5021452Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_complex128 PASSED [0.0083s] [ 60%] 2025-12-04T13:28:26.5021574Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expand_cuda_float64 PASSED [0.0058s] [ 60%] 2025-12-04T13:28:26.5021696Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bfloat16 PASSED [0.0157s] [ 60%] 2025-12-04T13:28:26.5021817Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_bool PASSED [0.0155s] [ 60%] 2025-12-04T13:28:26.5021976Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_expm1_cuda_int64 PASSED [1.5904s] [ 60%] 2025-12-04T13:28:26.5022178Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_exponential_cuda_float16 SKIPPED [0.0003s] (Expected: exponential is not comparable) [ 60%] 2025-12-04T13:28:26.5022294Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_bool PASSED [1.5790s] [ 60%] 2025-12-04T13:28:26.5022419Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_complex128 PASSED [0.0312s] [ 60%] 2025-12-04T13:28:26.5022547Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_eye_cuda_float8_e4m3fnuz PASSED [1.5814s] [ 60%] 2025-12-04T13:28:26.5022668Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_bool PASSED [0.0064s] [ 60%] 2025-12-04T13:28:26.5022795Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_complex32 PASSED [1.5850s] [ 60%] 2025-12-04T13:28:26.5022916Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int32 PASSED [0.0063s] [ 60%] 2025-12-04T13:28:26.5023038Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft2_cuda_int8 PASSED [1.5803s] [ 60%] 2025-12-04T13:28:26.5023160Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fft_cuda_float16 PASSED [0.0310s] [ 60%] 2025-12-04T13:28:26.5023301Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int16 PASSED [1.5825s] [ 61%] 2025-12-04T13:28:26.5023422Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int32 PASSED [0.0084s] [ 61%] 2025-12-04T13:28:26.5023567Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftn_cuda_int64 PASSED [1.5545s] [ 61%] 2025-12-04T13:28:26.5023697Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bfloat16 PASSED [0.0055s] [ 61%] 2025-12-04T13:28:26.5023836Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_bool PASSED [1.5654s] [ 61%] 2025-12-04T13:28:26.5023971Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_complex128 PASSED [0.0058s] [ 61%] 2025-12-04T13:28:26.5024098Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_fftshift_cuda_uint8 PASSED [1.5558s] [ 61%] 2025-12-04T13:28:26.5024221Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_bool PASSED [0.0065s] [ 61%] 2025-12-04T13:28:26.5024346Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_float64 PASSED [1.5563s] [ 61%] 2025-12-04T13:28:26.5024469Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft2_cuda_int64 PASSED [0.0065s] [ 61%] 2025-12-04T13:28:26.5024596Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_complex128 PASSED [1.5706s] [ 61%] 2025-12-04T13:28:26.5024722Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_float16 PASSED [0.0087s] [ 61%] 2025-12-04T13:28:26.5024843Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfft_cuda_int16 PASSED [1.5678s] [ 61%] 2025-12-04T13:28:26.5024974Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_complex128 PASSED [0.0088s] [ 61%] 2025-12-04T13:28:26.5025099Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float16 PASSED [1.5914s] [ 61%] 2025-12-04T13:28:26.5025223Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_float32 PASSED [0.0089s] [ 61%] 2025-12-04T13:28:26.5025345Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_hfftn_cuda_int16 PASSED [1.5799s] [ 61%] 2025-12-04T13:28:26.5025480Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_complex128 PASSED [0.0067s] [ 61%] 2025-12-04T13:28:26.5025604Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft2_cuda_uint8 PASSED [1.5682s] [ 61%] 2025-12-04T13:28:26.5025725Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_bool PASSED [0.0085s] [ 61%] 2025-12-04T13:28:26.5025871Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_complex128 PASSED [1.5745s] [ 61%] 2025-12-04T13:28:26.5025995Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_float16 PASSED [0.0082s] [ 61%] 2025-12-04T13:28:26.5026117Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifft_cuda_int16 PASSED [1.5771s] [ 61%] 2025-12-04T13:28:26.5026239Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_float64 PASSED [0.0084s] [ 61%] 2025-12-04T13:28:26.5026363Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftn_cuda_int32 PASSED [1.5796s] [ 61%] 2025-12-04T13:28:26.5026495Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_bfloat16 PASSED [0.0056s] [ 61%] 2025-12-04T13:28:26.5026623Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_int16 PASSED [1.5929s] [ 61%] 2025-12-04T13:28:26.5026751Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ifftshift_cuda_uint8 PASSED [0.0051s] [ 61%] 2025-12-04T13:28:26.5026877Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfft2_cuda_float32 PASSED [1.5675s] [ 61%] 2025-12-04T13:28:26.5027011Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_ihfftn_cuda_int16 PASSED [0.0097s] [ 61%] 2025-12-04T13:28:26.5027134Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_bool PASSED [1.5757s] [ 61%] 2025-12-04T13:28:26.5027267Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft2_cuda_int64 PASSED [0.0066s] [ 61%] 2025-12-04T13:28:26.5027399Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_bool PASSED [1.5585s] [ 61%] 2025-12-04T13:28:26.5027526Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_complex32 PASSED [0.0077s] [ 61%] 2025-12-04T13:28:26.5027650Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_float64 PASSED [1.5804s] [ 61%] 2025-12-04T13:28:26.5027772Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfft_cuda_int64 PASSED [0.0078s] [ 61%] 2025-12-04T13:28:26.5027904Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_complex128 PASSED [1.5621s] [ 61%] 2025-12-04T13:28:26.5028027Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_irfftn_cuda_int8 PASSED [0.0085s] [ 61%] 2025-12-04T13:28:26.5028152Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float16 PASSED [1.9123s] [ 61%] 2025-12-04T13:28:26.5028275Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_float32 PASSED [1.5817s] [ 61%] 2025-12-04T13:28:26.5028398Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int32 PASSED [0.0058s] [ 61%] 2025-12-04T13:28:26.5028521Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft2_cuda_int8 PASSED [1.5691s] [ 61%] 2025-12-04T13:28:26.5028642Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_int32 PASSED [0.0080s] [ 61%] 2025-12-04T13:28:26.5028764Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfft_cuda_uint8 PASSED [1.5716s] [ 61%] 2025-12-04T13:28:26.5028888Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft_rfftn_cuda_float16 PASSED [0.0077s] [ 61%] 2025-12-04T13:28:26.5029007Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int16 PASSED [0.0122s] [ 61%] 2025-12-04T13:28:26.5029123Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fill_cuda_int64 PASSED [1.5785s] [ 61%] 2025-12-04T13:28:26.5029249Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_bfloat16 PASSED [0.0196s] [ 61%] 2025-12-04T13:28:26.5029370Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flatten_cuda_int16 PASSED [0.0136s] [ 61%] 2025-12-04T13:28:26.5029497Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_bool PASSED [0.0045s] [ 61%] 2025-12-04T13:28:26.5029622Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_complex128 PASSED [0.0051s] [ 61%] 2025-12-04T13:28:26.5029740Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float16 PASSED [0.0048s] [ 61%] 2025-12-04T13:28:26.5029860Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_float64 PASSED [0.0049s] [ 61%] 2025-12-04T13:28:26.5029978Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flip_cuda_uint8 PASSED [0.0043s] [ 61%] 2025-12-04T13:28:26.5030097Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_bool PASSED [0.0024s] [ 61%] 2025-12-04T13:28:26.5030222Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_complex64 PASSED [0.0027s] [ 61%] 2025-12-04T13:28:26.5030345Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fliplr_cuda_float32 PASSED [0.0026s] [ 61%] 2025-12-04T13:28:26.5030468Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_bfloat16 PASSED [0.0026s] [ 61%] 2025-12-04T13:28:26.5030602Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_float64 PASSED [0.0027s] [ 61%] 2025-12-04T13:28:26.5030721Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_flipud_cuda_int16 PASSED [0.0024s] [ 61%] 2025-12-04T13:28:26.5030859Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_float_power_cuda_float32 PASSED [0.0537s] [ 61%] 2025-12-04T13:28:26.5030978Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_floor_cuda_uint8 PASSED [1.5913s] [ 61%] 2025-12-04T13:28:26.5031107Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_float64 PASSED [0.0472s] [ 61%] 2025-12-04T13:28:26.5031224Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmax_cuda_int16 PASSED [0.0326s] [ 61%] 2025-12-04T13:28:26.5031340Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int16 PASSED [0.0322s] [ 61%] 2025-12-04T13:28:26.5031456Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_fmin_cuda_int64 PASSED [1.5953s] [ 61%] 2025-12-04T13:28:26.5031576Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frac_cuda_bfloat16 PASSED [0.0171s] [ 61%] 2025-12-04T13:28:26.5031697Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_frexp_cuda_float16 PASSED [0.0178s] [ 62%] 2025-12-04T13:28:26.5031814Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_float32 PASSED [0.0377s] [ 62%] 2025-12-04T13:28:26.5031971Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ge_cuda_int64 PASSED [0.0356s] [ 62%] 2025-12-04T13:28:26.5032149Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_geometric_cuda_int16 SKIPPED [0.0001s] (Expected: geometric is not comparable) [ 62%] 2025-12-04T13:28:26.5032265Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int16 PASSED [0.0353s] [ 62%] 2025-12-04T13:28:26.5032377Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_gt_cuda_int32 PASSED [1.6173s] [ 62%] 2025-12-04T13:28:26.5032504Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_float32 PASSED [0.0484s] [ 62%] 2025-12-04T13:28:26.5032627Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_int32 PASSED [0.0352s] [ 62%] 2025-12-04T13:28:26.5032755Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_heaviside_cuda_uint8 PASSED [1.6091s] [ 62%] 2025-12-04T13:28:26.5032881Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hsplit_cuda_complex64 PASSED [0.0054s] [ 62%] 2025-12-04T13:28:26.5032999Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_bool PASSED [1.5829s] [ 62%] 2025-12-04T13:28:26.5033136Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hstack_cuda_float16 PASSED [0.0051s] [ 62%] 2025-12-04T13:28:26.5033257Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_bfloat16 PASSED [0.0473s] [ 62%] 2025-12-04T13:28:26.5033378Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_hypot_cuda_float16 PASSED [0.0461s] [ 62%] 2025-12-04T13:28:26.5033500Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_igammac_cuda_float32 PASSED [0.0561s] [ 62%] 2025-12-04T13:28:26.5033623Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_bool PASSED [1.5764s] [ 62%] 2025-12-04T13:28:26.5033750Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_complex32 PASSED [0.0072s] [ 62%] 2025-12-04T13:28:26.5033878Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float16 PASSED [1.5812s] [ 62%] 2025-12-04T13:28:26.5034002Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_float32 PASSED [0.0070s] [ 62%] 2025-12-04T13:28:26.5034125Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_int32 PASSED [1.5595s] [ 62%] 2025-12-04T13:28:26.5034261Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_add_cuda_uint8 PASSED [0.0058s] [ 62%] 2025-12-04T13:28:26.5034388Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float32 PASSED [1.5854s] [ 62%] 2025-12-04T13:28:26.5034516Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_float64 PASSED [0.0051s] [ 62%] 2025-12-04T13:28:26.5034654Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int16 PASSED [1.5895s] [ 62%] 2025-12-04T13:28:26.5034796Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_copy_cuda_int32 PASSED [0.0044s] [ 62%] 2025-12-04T13:28:26.5034924Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_float32 PASSED [1.5956s] [ 62%] 2025-12-04T13:28:26.5035049Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_fill_cuda_uint8 PASSED [0.0054s] [ 62%] 2025-12-04T13:28:26.5035179Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_bfloat16 PASSED [1.5658s] [ 62%] 2025-12-04T13:28:26.5035312Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_complex32 PASSED [0.0052s] [ 62%] 2025-12-04T13:28:26.5035441Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_float32 PASSED [1.5646s] [ 62%] 2025-12-04T13:28:26.5035569Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_index_select_cuda_int64 PASSED [0.0046s] [ 62%] 2025-12-04T13:28:26.5035697Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_complex128 PASSED [0.0307s] [ 62%] 2025-12-04T13:28:26.5035819Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isfinite_cuda_uint8 PASSED [1.5830s] [ 62%] 2025-12-04T13:28:26.5035940Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_float32 PASSED [0.0145s] [ 62%] 2025-12-04T13:28:26.5036059Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int16 PASSED [0.0109s] [ 62%] 2025-12-04T13:28:26.5036179Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isnan_cuda_int32 PASSED [0.0106s] [ 62%] 2025-12-04T13:28:26.5036302Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_float32 PASSED [1.5910s] [ 62%] 2025-12-04T13:28:26.5036424Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isneginf_cuda_int8 PASSED [0.0112s] [ 62%] 2025-12-04T13:28:26.5036546Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isposinf_cuda_int64 PASSED [0.0099s] [ 62%] 2025-12-04T13:28:26.5036669Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bfloat16 PASSED [1.5773s] [ 62%] 2025-12-04T13:28:26.5036798Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_bool PASSED [0.0141s] [ 62%] 2025-12-04T13:28:26.5036922Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_float16 PASSED [0.0122s] [ 62%] 2025-12-04T13:28:26.5037043Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_isreal_cuda_int64 PASSED [1.5836s] [ 62%] 2025-12-04T13:28:26.5037160Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_item_cuda_int32 PASSED [0.0052s] [ 62%] 2025-12-04T13:28:26.5037276Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_float32 PASSED [0.0387s] [ 62%] 2025-12-04T13:28:26.5037390Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_int32 PASSED [0.0356s] [ 62%] 2025-12-04T13:28:26.5037504Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_le_cuda_uint8 PASSED [0.0347s] [ 62%] 2025-12-04T13:28:26.5037624Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lerp_cuda_float16 PASSED [1.5941s] [ 62%] 2025-12-04T13:28:26.5037745Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lgamma_cuda_float16 PASSED [0.4312s] [ 62%] 2025-12-04T13:28:26.5037880Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_complex128 PASSED [0.0041s] [ 62%] 2025-12-04T13:28:26.5038030Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float16 PASSED [1.5723s] [ 62%] 2025-12-04T13:28:26.5038159Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_cross_cuda_float64 PASSED [0.0056s] [ 62%] 2025-12-04T13:28:26.5038283Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32 2025-12-04T13:28:26.5038298Z 2025-12-04T13:28:26.5038474Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/test_ops/test_ops-104b49b023bdc7f9.xml - 2025-12-04T13:28:26.5038538Z !!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:28:26.5038694Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py:2653: KeyboardInterrupt 2025-12-04T13:28:26.5038774Z (to show a full traceback on KeyboardInterrupt use --full-trace) 2025-12-04T13:28:26.5038852Z ========== 3634 passed, 516 skipped, 57 xfailed in 1793.18s (0:29:53) ========== 2025-12-04T13:28:26.5038901Z Command took >30min, returning 124 2025-12-04T13:28:26.5038938Z Got exit code 124 2025-12-04T13:28:26.5038980Z Retrying single test... 2025-12-04T13:28:26.5039102Z Test results will be stored in test-reports/python-pytest/test_ops/test_ops-1e1c25e384ae3fde.xml 2025-12-04T13:28:26.5039163Z ============================= test session starts ============================== 2025-12-04T13:28:26.5039276Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:28:26.5039318Z cachedir: .pytest_cache 2025-12-04T13:28:26.5039477Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:28:26.5039527Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:28:26.5039567Z configfile: pytest.ini 2025-12-04T13:28:26.5039748Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, anyio-4.12.0, typeguard-4.3.0 2025-12-04T13:28:26.5039831Z collecting ... collected 33666 items / 6701 deselected / 26965 selected 2025-12-04T13:28:26.5040032Z stepcurrent: skipping 4207 already run items. 
Running only test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32 2025-12-04T13:28:26.5040078Z Running 1 items in this shard 2025-12-04T13:28:26.5040081Z 2025-12-04T13:28:26.5040222Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32 PASSED [0.1680s] [100%] 2025-12-04T13:28:26.5040225Z 2025-12-04T13:28:26.5040385Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/test_ops/test_ops-1e1c25e384ae3fde.xml - 2025-12-04T13:28:26.5040462Z ====================== 1 passed, 6701 deselected in 1.64s ====================== 2025-12-04T13:28:26.5040501Z Got exit code 0 2025-12-04T13:28:26.5040583Z Test succeeded in new process, continuing with the rest of the tests 2025-12-04T13:28:26.5040699Z Test results will be stored in test-reports/python-pytest/test_ops/test_ops-f555603e316361f2.xml 2025-12-04T13:28:26.5040756Z ============================= test session starts ============================== 2025-12-04T13:28:26.5040867Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:28:26.5040907Z cachedir: .pytest_cache 2025-12-04T13:28:26.5041064Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:28:26.5041109Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:28:26.5041149Z configfile: pytest.ini 2025-12-04T13:28:26.5041324Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, anyio-4.12.0, typeguard-4.3.0 2025-12-04T13:28:26.5041405Z collecting ... collected 33666 items / 4208 deselected / 29458 selected 2025-12-04T13:28:26.5041473Z stepcurrent: skipping 4208 already run items. 
2025-12-04T13:28:26.5041518Z Running 2494 items in this shard 2025-12-04T13:28:26.5041520Z 2025-12-04T13:28:26.5041658Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_float32 PASSED [1.0747s] [ 0%] 2025-12-04T13:28:26.5041802Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_int64 PASSED [0.8297s] [ 0%] 2025-12-04T13:28:26.5041991Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_bfloat16 PASSED [0.9241s] [ 0%] 2025-12-04T13:28:26.5042132Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_matrix_norm_cuda_complex64 PASSED [0.1577s] [ 0%] 2025-12-04T13:28:26.5042264Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vecdot_cuda_float16 PASSED [0.8652s] [ 0%] 2025-12-04T13:28:26.5042402Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_bfloat16 PASSED [0.8064s] [ 0%] 2025-12-04T13:28:26.5042545Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_vector_norm_cuda_complex128 PASSED [0.8197s] [ 0%] 2025-12-04T13:28:26.5042670Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float16 PASSED [0.7908s] [ 0%] 2025-12-04T13:28:26.5042797Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_float64 PASSED [0.0225s] [ 0%] 2025-12-04T13:28:26.5042919Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linspace_cuda_int16 XFAIL [0.0110s] [ 0%] 2025-12-04T13:28:26.5043038Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_bool PASSED [0.7789s] [ 0%] 2025-12-04T13:28:26.5043157Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log10_cuda_int32 PASSED [0.7654s] [ 0%] 2025-12-04T13:28:26.5043275Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_bool PASSED [0.0193s] [ 0%] 2025-12-04T13:28:26.5043394Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_int32 PASSED [0.0148s] [ 0%] 2025-12-04T13:28:26.5043512Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log1p_cuda_uint8 PASSED [0.0131s] [ 0%] 2025-12-04T13:28:26.5043629Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_bool PASSED [0.0181s] [ 0%] 2025-12-04T13:28:26.5043745Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log2_cuda_int32 PASSED [0.7683s] [ 0%] 2025-12-04T13:28:26.5043862Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int32 PASSED [0.0196s] [ 0%] 2025-12-04T13:28:26.5043975Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int64 PASSED [0.0164s] [ 0%] 2025-12-04T13:28:26.5044103Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_int8 PASSED [0.0154s] [ 0%] 2025-12-04T13:28:26.5044217Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_cuda_uint8 PASSED [0.7776s] [ 0%] 2025-12-04T13:28:26.5044403Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float16 SKIPPED [0.0002s] (Expected: log_normal is not comparable) [ 0%] 2025-12-04T13:28:26.5044584Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_normal_cuda_float64 SKIPPED [0.0001s] (Expected: log_normal is not comparable) [ 0%] 2025-12-04T13:28:26.5044734Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex32 PASSED [0.7880s] [ 0%] 2025-12-04T13:28:26.5044880Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_log_softmax_with_dtype_cuda_complex64 PASSED [0.7612s] [ 1%] 2025-12-04T13:28:26.5045012Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logaddexp2_cuda_float64 PASSED [0.7631s] [ 1%] 2025-12-04T13:28:26.5045147Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_and_cuda_complex64 PASSED [0.0944s] 
[ 1%] 2025-12-04T13:28:26.5045293Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_complex64 PASSED [0.8102s] [ 1%] 2025-12-04T13:28:26.5045422Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_float64 PASSED [0.0156s] [ 1%] 2025-12-04T13:28:26.5045560Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int16 PASSED [0.0117s] [ 1%] 2025-12-04T13:28:26.5045687Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_not_cuda_int64 PASSED [0.7836s] [ 1%] 2025-12-04T13:28:26.5045829Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_bfloat16 PASSED [0.0448s] [ 1%] 2025-12-04T13:28:26.5045956Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_float32 PASSED [0.0404s] [ 1%] 2025-12-04T13:28:26.5046081Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_or_cuda_int64 PASSED [0.0375s] [ 1%] 2025-12-04T13:28:26.5046210Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logical_xor_cuda_float64 PASSED [0.0317s] [ 1%] 2025-12-04T13:28:26.5046334Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_cuda_bfloat16 PASSED [0.1204s] [ 1%] 2025-12-04T13:28:26.5046481Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_float64 PASSED [0.4106s] [ 1%] 2025-12-04T13:28:26.5046623Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int16 XFAIL [0.0472s] [ 1%] 2025-12-04T13:28:26.5046766Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_int64 XFAIL [0.8094s] [ 1%] 2025-12-04T13:28:26.5046909Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logspace_tensor_overload_cuda_uint8 PASSED [0.1069s] [ 1%] 2025-12-04T13:28:26.5047041Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_complex128 PASSED [0.0132s] [ 1%] 2025-12-04T13:28:26.5047168Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float16 PASSED [0.0056s] [ 1%] 2025-12-04T13:28:26.5047293Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_float64 PASSED [0.0091s] [ 1%] 2025-12-04T13:28:26.5047419Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_logsumexp_cuda_int64 PASSED [0.0053s] [ 1%] 2025-12-04T13:28:26.5047537Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_lt_cuda_bfloat16 PASSED [0.0399s] [ 1%] 2025-12-04T13:28:26.5047668Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_float32 PASSED [0.0053s] [ 1%] 2025-12-04T13:28:26.5047791Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int16 PASSED [0.7692s] [ 1%] 2025-12-04T13:28:26.5047927Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_masked_fill_cuda_int32 PASSED [0.0064s] [ 1%] 2025-12-04T13:28:26.5048050Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mean_cuda_complex64 PASSED [0.0184s] [ 1%] 2025-12-04T13:28:26.5048193Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_bool PASSED [0.0063s] [ 2%] 2025-12-04T13:28:26.5048343Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_complex128 PASSED [0.0084s] [ 2%] 2025-12-04T13:28:26.5048491Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_float64 PASSED [0.7571s] [ 2%] 2025-12-04T13:28:26.5048633Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_list_of_tensors_cuda_int8 PASSED [0.0077s] [ 2%] 2025-12-04T13:28:26.5048779Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_float64 PASSED 
[0.0083s] [ 2%] 2025-12-04T13:28:26.5048924Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int64 PASSED [0.0061s] [ 2%] 2025-12-04T13:28:26.5049077Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_meshgrid_variadic_tensors_cuda_int8 PASSED [0.0061s] [ 2%] 2025-12-04T13:28:26.5049198Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_bool PASSED [0.0557s] [ 2%] 2025-12-04T13:28:26.5049332Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_int16 PASSED [0.0321s] [ 2%] 2025-12-04T13:28:26.5049454Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_minimum_cuda_uint8 PASSED [0.0314s] [ 2%] 2025-12-04T13:28:26.5049591Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_complex64 PASSED [0.7578s] [ 2%] 2025-12-04T13:28:26.5049715Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float32 PASSED [0.0065s] [ 2%] 2025-12-04T13:28:26.5049837Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_float64 PASSED [0.7454s] [ 2%] 2025-12-04T13:28:26.5049959Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_int16 PASSED [0.0055s] [ 2%] 2025-12-04T13:28:26.5050078Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_movedim_cuda_uint8 PASSED [0.7521s] [ 2%] 2025-12-04T13:28:26.5050195Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int64 PASSED [0.0394s] [ 2%] 2025-12-04T13:28:26.5050309Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_mul_cuda_int8 PASSED [0.8074s] [ 2%] 2025-12-04T13:28:26.5050435Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_bool PASSED [0.0085s] [ 2%] 2025-12-04T13:28:26.5050568Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_complex64 PASSED [0.0093s] [ 2%] 
2025-12-04T13:28:26.5050698Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float16 PASSED [0.7598s] [ 2%] 2025-12-04T13:28:26.5050827Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float32 PASSED [0.0105s] [ 2%] 2025-12-04T13:28:26.5050955Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_copy_cuda_float64 PASSED [0.0091s] [ 2%] 2025-12-04T13:28:26.5051078Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_bfloat16 PASSED [0.7732s] [ 2%] 2025-12-04T13:28:26.5051202Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_narrow_cuda_complex64 PASSED [0.0173s] [ 2%] 2025-12-04T13:28:26.5051322Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ne_cuda_bfloat16 PASSED [0.0402s] [ 2%] 2025-12-04T13:28:26.5051443Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex32 PASSED [0.1817s] [ 3%] 2025-12-04T13:28:26.5051580Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_complex64 PASSED [0.0290s] [ 3%] 2025-12-04T13:28:26.5051698Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float16 PASSED [0.7882s] [ 3%] 2025-12-04T13:28:26.5051816Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float32 PASSED [0.0165s] [ 3%] 2025-12-04T13:28:26.5051966Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_neg_cuda_float64 PASSED [0.0148s] [ 3%] 2025-12-04T13:28:26.5052138Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_bool SKIPPED [0.0001s] (Expected: empty is not comparable) [ 3%] 2025-12-04T13:28:26.5052317Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_complex32 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 3%] 2025-12-04T13:28:26.5052487Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_cuda_uint8 
SKIPPED [0.0001s] (Expected: empty is not comparable) [ 3%] 2025-12-04T13:28:26.5052687Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_complex32 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 3%] 2025-12-04T13:28:26.5052889Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int16 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 3%] 2025-12-04T13:28:26.5053077Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_empty_strided_cuda_int8 SKIPPED [0.0001s] (Expected: empty_strided is not comparable) [ 3%] 2025-12-04T13:28:26.5053216Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_full_cuda_complex64 PASSED [0.7862s] [ 3%] 2025-12-04T13:28:26.5053354Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float32 PASSED [0.0061s] [ 3%] 2025-12-04T13:28:26.5053478Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_float64 PASSED [0.7717s] [ 3%] 2025-12-04T13:28:26.5053604Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_ones_cuda_uint8 PASSED [0.0056s] [ 3%] 2025-12-04T13:28:26.5053727Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_new_zeros_cuda_int8 PASSED [0.7726s] [ 3%] 2025-12-04T13:28:26.5053853Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_bfloat16 PASSED [0.0628s] [ 3%] 2025-12-04T13:28:26.5053980Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nextafter_cuda_float16 PASSED [0.0433s] [ 3%] 2025-12-04T13:28:26.5054180Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_alpha_dropout_cuda_bfloat16 SKIPPED [0.0001s] (Expected: dropout is not comparable) [ 3%] 2025-12-04T13:28:26.5054323Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_bfloat16 PASSED [0.0220s] [ 3%] 2025-12-04T13:28:26.5054461Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float32 PASSED [0.7715s] [ 3%] 2025-12-04T13:28:26.5054600Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_celu_cuda_float64 PASSED [0.0173s] [ 3%] 2025-12-04T13:28:26.5054758Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_complex64 PASSED [0.7585s] [ 3%] 2025-12-04T13:28:26.5054910Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int32 PASSED [0.0040s] [ 3%] 2025-12-04T13:28:26.5055059Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_channel_shuffle_cuda_int64 PASSED [0.7682s] [ 3%] 2025-12-04T13:28:26.5055252Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_dropout_cuda_float32 SKIPPED [0.0002s] (Expected: dropout is not comparable) [ 4%] 2025-12-04T13:28:26.5055392Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_gelu_cuda_bfloat16 PASSED [0.7930s] [ 4%] 2025-12-04T13:28:26.5055542Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float32 PASSED [0.7770s] [ 4%] 2025-12-04T13:28:26.5055678Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_glu_cuda_float64 PASSED [0.7625s] [ 4%] 2025-12-04T13:28:26.5055827Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardshrink_cuda_float16 PASSED [0.0224s] [ 4%] 2025-12-04T13:28:26.5055972Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_float32 PASSED [0.0159s] [ 4%] 2025-12-04T13:28:26.5056112Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_hardtanh_cuda_int32 PASSED [0.7783s] [ 4%] 2025-12-04T13:28:26.5056259Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_huber_loss_cuda_float64 PASSED 
[0.0291s] [ 4%] 2025-12-04T13:28:26.5056399Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_l1_loss_cuda_float16 PASSED [0.7517s] [ 4%] 2025-12-04T13:28:26.5056544Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_layer_norm_cuda_float64 PASSED [0.0106s] [ 4%] 2025-12-04T13:28:26.5056696Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_leaky_relu_cuda_float64 PASSED [0.7648s] [ 4%] 2025-12-04T13:28:26.5056857Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_float16 PASSED [0.7514s] [ 4%] 2025-12-04T13:28:26.5057025Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_log_softmax_with_dtype_cuda_int32 PASSED [0.7523s] [ 4%] 2025-12-04T13:28:26.5057189Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_margin_ranking_loss_cuda_int32 PASSED [0.0136s] [ 4%] 2025-12-04T13:28:26.5057328Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_bfloat16 PASSED [0.0285s] [ 4%] 2025-12-04T13:28:26.5057466Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_mish_cuda_float32 PASSED [0.7665s] [ 4%] 2025-12-04T13:28:26.5057609Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_nll_loss_cuda_float32 PASSED [0.0523s] [ 4%] 2025-12-04T13:28:26.5057760Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_shuffle_cuda_complex64 PASSED [0.0041s] [ 4%] 2025-12-04T13:28:26.5057915Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_bfloat16 PASSED [0.0035s] [ 4%] 2025-12-04T13:28:26.5058072Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_complex128 PASSED [0.0036s] [ 4%] 2025-12-04T13:28:26.5058224Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_pixel_unshuffle_cuda_int32 PASSED [0.0031s] [ 4%] 2025-12-04T13:28:26.5058364Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_prelu_cuda_bfloat16 PASSED [0.0561s] [ 4%] 2025-12-04T13:28:26.5058503Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu6_cuda_bfloat16 PASSED [0.7715s] [ 4%] 2025-12-04T13:28:26.5058641Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_float32 PASSED [0.0180s] [ 4%] 2025-12-04T13:28:26.5058776Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int16 PASSED [0.0115s] [ 4%] 2025-12-04T13:28:26.5058911Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_relu_cuda_int64 PASSED [0.7755s] [ 5%] 2025-12-04T13:28:26.5059047Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_selu_cuda_float64 PASSED [0.0173s] [ 5%] 2025-12-04T13:28:26.5059203Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_float32 PASSED [0.7557s] [ 5%] 2025-12-04T13:28:26.5059365Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_int32 PASSED [0.7673s] [ 5%] 2025-12-04T13:28:26.5059518Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmax_with_dtype_cuda_uint8 PASSED [0.7601s] [ 5%] 2025-12-04T13:28:26.5059671Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_float32 PASSED [0.7518s] [ 5%] 2025-12-04T13:28:26.5059822Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softmin_with_dtype_cuda_uint8 PASSED [0.7623s] [ 5%] 2025-12-04T13:28:26.5059966Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_bfloat16 PASSED [0.0184s] [ 5%] 
2025-12-04T13:28:26.5060110Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softplus_cuda_float64 PASSED [0.0174s] [ 5%] 2025-12-04T13:28:26.5060256Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_softshrink_cuda_float32 PASSED [0.7841s] [ 5%] 2025-12-04T13:28:26.5060405Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_complex128 PASSED [0.0433s] [ 5%] 2025-12-04T13:28:26.5060563Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float16 PASSED [0.7851s] [ 5%] 2025-12-04T13:28:26.5060706Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_float64 PASSED [0.0179s] [ 5%] 2025-12-04T13:28:26.5060857Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_tanhshrink_cuda_int64 PASSED [0.0143s] [ 5%] 2025-12-04T13:28:26.5061010Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float16 PASSED [0.7821s] [ 5%] 2025-12-04T13:28:26.5061153Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_float64 PASSED [0.0187s] [ 5%] 2025-12-04T13:28:26.5061294Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_threshold_cuda_int8 PASSED [0.0115s] [ 5%] 2025-12-04T13:28:26.5061454Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_complex64 PASSED [0.7497s] [ 5%] 2025-12-04T13:28:26.5061605Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_nn_functional_triplet_margin_loss_cuda_int8 PASSED [0.0069s] [ 5%] 2025-12-04T13:28:26.5061727Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_norm_cuda_float32 PASSED [0.7742s] [ 5%] 2025-12-04T13:28:26.5061940Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_normal_cuda_float32 SKIPPED [0.0002s] 
(Expected: normal is not comparable) [ 5%] 2025-12-04T13:28:26.5062061Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_float64 PASSED [0.7548s] [ 5%] 2025-12-04T13:28:26.5062180Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int16 PASSED [0.0036s] [ 5%] 2025-12-04T13:28:26.5062297Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_int32 PASSED [0.7594s] [ 5%] 2025-12-04T13:28:26.5062416Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ones_cuda_uint8 PASSED [0.0038s] [ 5%] 2025-12-04T13:28:26.5062550Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_complex32 PASSED [0.0226s] [ 6%] 2025-12-04T13:28:26.5062683Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_float16 PASSED [0.0206s] [ 6%] 2025-12-04T13:28:26.5062810Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_copy_cuda_int8 PASSED [0.0156s] [ 6%] 2025-12-04T13:28:26.5062939Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_complex128 PASSED [0.0244s] [ 6%] 2025-12-04T13:28:26.5063062Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_float16 PASSED [0.0227s] [ 6%] 2025-12-04T13:28:26.5063198Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_permute_cuda_int64 PASSED [0.0173s] [ 6%] 2025-12-04T13:28:26.5063320Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_positive_cuda_int64 PASSED [0.7696s] [ 6%] 2025-12-04T13:28:26.5063441Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_float32 PASSED [0.0516s] [ 6%] 2025-12-04T13:28:26.5063554Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_pow_cuda_int8 PASSED [0.7884s] [ 6%] 2025-12-04T13:28:26.5063677Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_bfloat16 PASSED [0.0245s] [ 6%] 
2025-12-04T13:28:26.5063800Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_prod_cuda_complex32 PASSED [1.3604s] [ 6%] 2025-12-04T13:28:26.5063922Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_float16 PASSED [0.7649s] [ 6%] 2025-12-04T13:28:26.5064042Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rad2deg_cuda_int64 PASSED [0.0150s] [ 6%] 2025-12-04T13:28:26.5064162Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_bfloat16 PASSED [0.7602s] [ 6%] 2025-12-04T13:28:26.5064300Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_complex32 PASSED [0.0054s] [ 6%] 2025-12-04T13:28:26.5064419Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_ravel_cuda_int32 PASSED [0.7664s] [ 6%] 2025-12-04T13:28:26.5064552Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_float16 PASSED [0.0157s] [ 6%] 2025-12-04T13:28:26.5064669Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int64 PASSED [0.7663s] [ 6%] 2025-12-04T13:28:26.5064805Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_real_cuda_int8 PASSED [0.0109s] [ 6%] 2025-12-04T13:28:26.5064927Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_bool PASSED [0.0177s] [ 6%] 2025-12-04T13:28:26.5065057Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_float16 PASSED [0.0162s] [ 6%] 2025-12-04T13:28:26.5065183Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reciprocal_cuda_int64 PASSED [0.7708s] [ 6%] 2025-12-04T13:28:26.5065309Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_float16 PASSED [0.0853s] [ 6%] 2025-12-04T13:28:26.5065433Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_remainder_cuda_uint8 PASSED [0.7652s] [ 6%] 2025-12-04T13:28:26.5065554Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_renorm_cuda_float16 PASSED [0.0095s] [ 6%] 2025-12-04T13:28:26.5065676Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_float64 PASSED [0.0129s] [ 7%] 2025-12-04T13:28:26.5065795Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_repeat_cuda_int64 PASSED [0.0105s] [ 7%] 2025-12-04T13:28:26.5065924Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_float16 PASSED [0.7325s] [ 7%] 2025-12-04T13:28:26.5066048Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_int64 PASSED [0.0093s] [ 7%] 2025-12-04T13:28:26.5066173Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_as_cuda_uint8 PASSED [0.7104s] [ 7%] 2025-12-04T13:28:26.5066292Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_bool PASSED [0.0158s] [ 7%] 2025-12-04T13:28:26.5066420Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex32 PASSED [0.0186s] [ 7%] 2025-12-04T13:28:26.5066547Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_complex64 PASSED [0.0179s] [ 7%] 2025-12-04T13:28:26.5066670Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_reshape_cuda_float32 PASSED [0.0171s] [ 7%] 2025-12-04T13:28:26.5066801Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_complex64 PASSED [0.7286s] [ 7%] 2025-12-04T13:28:26.5066922Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_float32 PASSED [0.0085s] [ 7%] 2025-12-04T13:28:26.5067039Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_roll_cuda_int16 PASSED [0.7178s] [ 7%] 2025-12-04T13:28:26.5067157Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_bool PASSED [0.0101s] [ 7%] 2025-12-04T13:28:26.5067282Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_complex128 PASSED [0.7333s] [ 7%] 2025-12-04T13:28:26.5067402Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float16 PASSED [0.0134s] [ 7%] 2025-12-04T13:28:26.5067524Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_float32 PASSED [0.7410s] [ 7%] 2025-12-04T13:28:26.5067641Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_int64 PASSED [0.0103s] [ 7%] 2025-12-04T13:28:26.5067759Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rot90_cuda_uint8 PASSED [0.7194s] [ 7%] 2025-12-04T13:28:26.5067889Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_round_cuda_float64 PASSED [0.0171s] [ 7%] 2025-12-04T13:28:26.5068011Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_bfloat16 PASSED [0.0168s] [ 7%] 2025-12-04T13:28:26.5068130Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_float64 PASSED [0.0159s] [ 7%] 2025-12-04T13:28:26.5068258Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsqrt_cuda_int32 PASSED [0.7502s] [ 7%] 2025-12-04T13:28:26.5068384Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_rsub_cuda_int16 PASSED [0.0302s] [ 7%] 2025-12-04T13:28:26.5068515Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_select_scatter_cuda_bool PASSED [0.0036s] [ 7%] 2025-12-04T13:28:26.5068633Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float32 PASSED [0.7279s] [ 7%] 2025-12-04T13:28:26.5068751Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_float64 PASSED [0.0142s] [ 8%] 2025-12-04T13:28:26.5068866Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sgn_cuda_int32 PASSED [0.0095s] [ 8%] 2025-12-04T13:28:26.5068986Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_bfloat16 
PASSED [0.7396s] [ 8%] 2025-12-04T13:28:26.5069106Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_float32 PASSED [0.0170s] [ 8%] 2025-12-04T13:28:26.5069221Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sign_cuda_int64 PASSED [0.0108s] [ 8%] 2025-12-04T13:28:26.5069346Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_signbit_cuda_float32 PASSED [0.0119s] [ 8%] 2025-12-04T13:28:26.5069460Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int32 PASSED [0.7396s] [ 8%] 2025-12-04T13:28:26.5069575Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sin_cuda_int64 PASSED [0.0150s] [ 8%] 2025-12-04T13:28:26.5069691Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_bool PASSED [0.0195s] [ 8%] 2025-12-04T13:28:26.5069813Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_complex64 PASSED [0.4726s] [ 8%] 2025-12-04T13:28:26.5069930Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_float32 PASSED [0.7363s] [ 8%] 2025-12-04T13:28:26.5070047Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int16 PASSED [0.0165s] [ 8%] 2025-12-04T13:28:26.5070164Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinc_cuda_int64 PASSED [0.0145s] [ 8%] 2025-12-04T13:28:26.5070284Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sinh_cuda_float64 PASSED [0.0189s] [ 8%] 2025-12-04T13:28:26.5070431Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_float32 PASSED [0.7424s] [ 8%] 2025-12-04T13:28:26.5070565Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j0_cuda_int8 PASSED [0.0215s] [ 8%] 2025-12-04T13:28:26.5070702Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_float32 PASSED [0.0188s] [ 8%] 2025-12-04T13:28:26.5070834Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_bessel_j1_cuda_int32 PASSED [0.0162s] [ 8%] 2025-12-04T13:28:26.5070963Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_int64 PASSED [0.7480s] [ 8%] 2025-12-04T13:28:26.5071090Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_entr_cuda_uint8 PASSED [0.0148s] [ 8%] 2025-12-04T13:28:26.5071218Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_int64 PASSED [0.0183s] [ 8%] 2025-12-04T13:28:26.5071346Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_erfcx_cuda_uint8 PASSED [0.0133s] [ 8%] 2025-12-04T13:28:26.5071472Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float32 PASSED [0.7372s] [ 8%] 2025-12-04T13:28:26.5071608Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1_cuda_float64 PASSED [0.3079s] [ 8%] 2025-12-04T13:28:26.5071739Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_bfloat16 PASSED [0.0185s] [ 8%] 2025-12-04T13:28:26.5071936Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_i1e_cuda_int32 PASSED [0.0156s] [ 9%] 2025-12-04T13:28:26.5072068Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_ndtr_cuda_int8 PASSED [0.7363s] [ 9%] 2025-12-04T13:28:26.5072235Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_bfloat16 PASSED [0.7104s] [ 9%] 2025-12-04T13:28:26.5072392Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex128 PASSED [0.7181s] [ 9%] 2025-12-04T13:28:26.5072548Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_complex64 PASSED [0.7149s] [ 9%] 2025-12-04T13:28:26.5072699Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_log_softmax_with_dtype_cuda_float32 PASSED [0.7097s] [ 9%] 2025-12-04T13:28:26.5072828Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_logit_cuda_int32 PASSED [0.0182s] [ 9%] 2025-12-04T13:28:26.5072987Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float16 PASSED [0.0254s] [ 9%] 2025-12-04T13:28:26.5073146Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_1_cuda_float32 PASSED [0.0227s] [ 9%] 2025-12-04T13:28:26.5073304Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_bfloat16 PASSED [0.7416s] [ 9%] 2025-12-04T13:28:26.5073462Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_float32 PASSED [0.0236s] [ 9%] 2025-12-04T13:28:26.5073616Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_3_cuda_uint8 PASSED [0.0195s] [ 9%] 2025-12-04T13:28:26.5073770Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_multigammaln_mvlgamma_p_5_cuda_uint8 PASSED [0.0194s] [ 9%] 2025-12-04T13:28:26.5073899Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtr_cuda_int16 PASSED [0.7307s] [ 9%] 2025-12-04T13:28:26.5074032Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_ndtri_cuda_float64 PASSED [0.3245s] [ 9%] 2025-12-04T13:28:26.5074182Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_float16 PASSED [0.0045s] [ 9%] 2025-12-04T13:28:26.5074339Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_softmax_with_dtype_cuda_int8 PASSED [0.7283s] [ 9%] 2025-12-04T13:28:26.5074473Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int32 PASSED 
[0.0471s] [ 9%] 2025-12-04T13:28:26.5074602Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_int8 PASSED [0.7680s] [ 9%] 2025-12-04T13:28:26.5074733Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_xlog1py_cuda_uint8 PASSED [0.0461s] [ 9%] 2025-12-04T13:28:26.5074863Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float32 PASSED [1.3381s] [ 9%] 2025-12-04T13:28:26.5074995Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_float64 PASSED [12.4065s] [ 9%] 2025-12-04T13:28:26.5075121Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_special_zeta_cuda_int64 PASSED [0.7698s] [ 9%] 2025-12-04T13:28:26.5075240Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sqrt_cuda_uint8 PASSED [0.0144s] [ 9%] 2025-12-04T13:28:26.5075362Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_square_cuda_float32 PASSED [0.0176s] [ 9%] 2025-12-04T13:28:26.5075502Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_copy_cuda_int8 PASSED [0.0040s] [ 10%] 2025-12-04T13:28:26.5075620Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_cuda_int8 PASSED [0.7262s] [ 10%] 2025-12-04T13:28:26.5075769Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_float32 PASSED [0.0058s] [ 10%] 2025-12-04T13:28:26.5075914Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_int32 PASSED [0.0039s] [ 10%] 2025-12-04T13:28:26.5076044Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_squeeze_multiple_cuda_uint8 PASSED [0.7321s] [ 10%] 2025-12-04T13:28:26.5076169Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex128 PASSED [0.0166s] [ 10%] 2025-12-04T13:28:26.5076291Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_complex32 PASSED 
[0.0049s] [ 10%] 2025-12-04T13:28:26.5076411Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_int16 PASSED [0.0040s] [ 10%] 2025-12-04T13:28:26.5076528Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_stack_cuda_uint8 PASSED [0.0040s] [ 10%] 2025-12-04T13:28:26.5076647Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_cuda_float16 PASSED [0.7528s] [ 10%] 2025-12-04T13:28:26.5076770Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_std_mean_cuda_float64 PASSED [0.7457s] [ 10%] 2025-12-04T13:28:26.5076894Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex128 PASSED [0.0692s] [ 10%] 2025-12-04T13:28:26.5077014Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_complex32 PASSED [0.7986s] [ 10%] 2025-12-04T13:28:26.5077132Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int16 PASSED [0.0402s] [ 10%] 2025-12-04T13:28:26.5077247Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sub_cuda_int8 PASSED [0.7642s] [ 10%] 2025-12-04T13:28:26.5077361Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int32 PASSED [0.0088s] [ 10%] 2025-12-04T13:28:26.5077475Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_sum_cuda_int64 PASSED [0.0072s] [ 10%] 2025-12-04T13:28:26.5077593Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_bool PASSED [0.7169s] [ 10%] 2025-12-04T13:28:26.5077719Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_complex64 PASSED [0.0047s] [ 10%] 2025-12-04T13:28:26.5077838Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_float64 PASSED [0.7375s] [ 10%] 2025-12-04T13:28:26.5077967Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_copy_cuda_int16 PASSED [0.0042s] [ 10%] 2025-12-04T13:28:26.5078081Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float32 PASSED [0.7233s] [ 10%] 2025-12-04T13:28:26.5078197Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_float64 PASSED [0.0043s] [ 10%] 2025-12-04T13:28:26.5078308Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int16 PASSED [0.7146s] [ 10%] 2025-12-04T13:28:26.5078421Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_t_cuda_int8 PASSED [0.0039s] [ 10%] 2025-12-04T13:28:26.5078557Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex128 PASSED [0.0597s] [ 11%] 2025-12-04T13:28:26.5078694Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_complex64 PASSED [0.7351s] [ 11%] 2025-12-04T13:28:26.5078825Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_float64 PASSED [0.0075s] [ 11%] 2025-12-04T13:28:26.5078956Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int32 PASSED [0.0048s] [ 11%] 2025-12-04T13:28:26.5079095Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_int8 PASSED [0.0045s] [ 11%] 2025-12-04T13:28:26.5079224Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_take_along_dim_cuda_uint8 PASSED [0.7134s] [ 11%] 2025-12-04T13:28:26.5079355Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tan_cuda_int8 PASSED [0.0205s] [ 11%] 2025-12-04T13:28:26.5079472Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int16 PASSED [0.0141s] [ 11%] 2025-12-04T13:28:26.5079603Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int32 PASSED [0.0137s] [ 11%] 2025-12-04T13:28:26.5079718Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_int64 PASSED [0.7312s] [ 11%] 2025-12-04T13:28:26.5079835Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tanh_cuda_uint8 PASSED [0.0153s] [ 11%] 2025-12-04T13:28:26.5079967Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_complex64 PASSED [0.7234s] [ 11%] 2025-12-04T13:28:26.5080095Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tensor_split_cuda_uint8 PASSED [0.0067s] [ 11%] 2025-12-04T13:28:26.5080212Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_to_cuda_float32 PASSED [0.0120s] [ 11%] 2025-12-04T13:28:26.5080330Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trace_cuda_int32 PASSED [0.7232s] [ 11%] 2025-12-04T13:28:26.5080467Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_complex128 PASSED [0.0066s] [ 11%] 2025-12-04T13:28:26.5080598Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int32 PASSED [0.7130s] [ 11%] 2025-12-04T13:28:26.5080730Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_copy_cuda_int64 PASSED [0.0052s] [ 11%] 2025-12-04T13:28:26.5080861Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_complex128 PASSED [0.7223s] [ 11%] 2025-12-04T13:28:26.5080986Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_transpose_cuda_int16 PASSED [0.0053s] [ 11%] 2025-12-04T13:28:26.5081107Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_bfloat16 PASSED [0.7316s] [ 11%] 2025-12-04T13:28:26.5081232Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex128 PASSED [0.0067s] [ 11%] 2025-12-04T13:28:26.5081355Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_cuda_complex64 PASSED [0.7133s] [ 11%] 2025-12-04T13:28:26.5081483Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_tril_indices_cuda_int64 PASSED [0.0206s] [ 11%] 2025-12-04T13:28:26.5081614Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_bfloat16 PASSED [0.7247s] [ 11%] 2025-12-04T13:28:26.5081736Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_complex64 PASSED [0.0064s] [ 12%] 2025-12-04T13:28:26.5081894Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_triu_cuda_float16 PASSED [0.7294s] [ 12%] 2025-12-04T13:28:26.5082014Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_float16 PASSED [0.0173s] [ 12%] 2025-12-04T13:28:26.5082132Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_trunc_cuda_int64 PASSED [0.0107s] [ 12%] 2025-12-04T13:28:26.5082258Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_bool PASSED [0.7360s] [ 12%] 2025-12-04T13:28:26.5082391Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_complex128 PASSED [0.0088s] [ 12%] 2025-12-04T13:28:26.5082520Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float16 PASSED [0.7129s] [ 12%] 2025-12-04T13:28:26.5082648Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_copy_cuda_float32 PASSED [0.0085s] [ 12%] 2025-12-04T13:28:26.5082786Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex128 PASSED [0.7337s] [ 12%] 2025-12-04T13:28:26.5082911Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_complex32 PASSED [0.0122s] [ 12%] 2025-12-04T13:28:26.5083030Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unbind_cuda_int32 PASSED [0.7140s] [ 12%] 2025-12-04T13:28:26.5083172Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unflatten_cuda_complex32 PASSED [0.0069s] [ 12%] 2025-12-04T13:28:26.5083308Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_bool PASSED [0.7180s] [ 12%] 2025-12-04T13:28:26.5083439Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_copy_cuda_complex64 PASSED [0.0107s] [ 12%] 2025-12-04T13:28:26.5083561Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unfold_cuda_float32 PASSED [0.7203s] [ 12%] 2025-12-04T13:28:26.5083696Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_bfloat16 PASSED [0.0069s] [ 12%] 2025-12-04T13:28:26.5083826Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_float32 PASSED [0.7157s] [ 12%] 2025-12-04T13:28:26.5083957Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int16 PASSED [0.0055s] [ 12%] 2025-12-04T13:28:26.5084086Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int32 PASSED [0.7174s] [ 12%] 2025-12-04T13:28:26.5084216Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_copy_cuda_int64 PASSED [0.0052s] [ 12%] 2025-12-04T13:28:26.5084341Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_float16 PASSED [0.0058s] [ 12%] 2025-12-04T13:28:26.5084464Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_unsqueeze_cuda_int64 PASSED [0.0045s] [ 12%] 2025-12-04T13:28:26.5084582Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_cuda_float16 PASSED [0.0088s] [ 12%] 2025-12-04T13:28:26.5084706Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_var_mean_cuda_float32 PASSED [0.0108s] [ 12%] 2025-12-04T13:28:26.5084827Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vdot_cuda_complex64 PASSED [0.7287s] [ 12%] 2025-12-04T13:28:26.5084952Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_complex32 PASSED [0.0119s] [ 13%] 2025-12-04T13:28:26.5085073Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int16 PASSED [0.7213s] [ 13%] 2025-12-04T13:28:26.5085192Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int32 PASSED [0.0093s] [ 13%] 2025-12-04T13:28:26.5085326Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_as_cuda_int64 PASSED [0.7290s] [ 13%] 2025-12-04T13:28:26.5085449Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_copy_cuda_int64 PASSED [0.0046s] [ 13%] 2025-12-04T13:28:26.5085571Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_float16 PASSED [0.0176s] [ 13%] 2025-12-04T13:28:26.5085687Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int32 PASSED [0.7301s] [ 13%] 2025-12-04T13:28:26.5085804Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int64 PASSED [0.0159s] [ 13%] 2025-12-04T13:28:26.5085919Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_view_cuda_int8 PASSED [0.0139s] [ 13%] 2025-12-04T13:28:26.5086047Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_complex128 PASSED [0.7216s] [ 13%] 2025-12-04T13:28:26.5086165Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vsplit_cuda_int8 PASSED [0.0039s] [ 13%] 2025-12-04T13:28:26.5086290Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_complex32 PASSED [0.7245s] [ 13%] 2025-12-04T13:28:26.5086420Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_vstack_cuda_int8 PASSED [0.0044s] [ 13%] 2025-12-04T13:28:26.5086544Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_complex128 PASSED [0.0111s] [ 13%] 2025-12-04T13:28:26.5086664Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_where_cuda_float16 PASSED [0.7330s] [ 13%] 2025-12-04T13:28:26.5086789Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_bool PASSED [0.0467s] [ 13%] 2025-12-04T13:28:26.5086920Z 
test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_float32 PASSED [0.7698s] [ 13%] 2025-12-04T13:28:26.5087036Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int16 PASSED [0.0483s] [ 13%] 2025-12-04T13:28:26.5087155Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_int32 PASSED [0.7683s] [ 13%] 2025-12-04T13:28:26.5087271Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_xlogy_cuda_uint8 PASSED [0.0477s] [ 13%] 2025-12-04T13:28:26.5087389Z test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_zeros_cuda_int16 PASSED [0.7141s] [ 13%] 2025-12-04T13:28:26.5087494Z test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_aminmax_cuda PASSED [0.0090s] [ 13%] 2025-12-04T13:28:26.5087593Z test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_any_cuda PASSED [0.7163s] [ 13%] 2025-12-04T13:28:26.5087695Z test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_argmax_cuda PASSED [0.0095s] [ 13%] 2025-12-04T13:28:26.5087796Z test_ops.py::TestCommonCUDA::test_reduction_ops_reduce_argmin_cuda PASSED [0.7332s] [ 13%] 2025-12-04T13:28:26.5087916Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rdiv___cuda_float32 PASSED [0.0109s] [ 14%] 2025-12-04T13:28:26.5088040Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rmul___cuda_complex64 PASSED [0.0177s] [ 14%] 2025-12-04T13:28:26.5088161Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager___rsub___cuda_complex64 PASSED [0.0151s] [ 14%] 2025-12-04T13:28:26.5089736Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager__unsafe_masked_index_put_accumulate_cuda_complex64 PASSED [0.7485s] [ 14%] 2025-12-04T13:28:26.5089861Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_complex64 PASSED [0.7435s] [ 14%] 2025-12-04T13:28:26.5089981Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_acosh_cuda_float32 PASSED [0.7289s] [ 14%] 2025-12-04T13:28:26.5090099Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmm_cuda_complex64 PASSED [0.7595s] [ 14%] 2025-12-04T13:28:26.5090218Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_addmv_cuda_complex64 PASSED [0.7816s] [ 14%] 2025-12-04T13:28:26.5090336Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_allclose_cuda_float32 PASSED [0.7503s] [ 14%] 2025-12-04T13:28:26.5090471Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_aminmax_cuda_float32 PASSED [0.0058s] [ 14%] 2025-12-04T13:28:26.5090589Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_angle_cuda_float32 PASSED [0.7461s] [ 14%] 2025-12-04T13:28:26.5090707Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argsort_cuda_float32 PASSED [0.1587s] [ 14%] 2025-12-04T13:28:26.5090828Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_argwhere_cuda_complex64 PASSED [0.8205s] [ 14%] 2025-12-04T13:28:26.5091011Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_complex64 SKIPPED [0.0002s] (Errors when storage_offset is included) [ 14%] 2025-12-04T13:28:26.5091189Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_as_strided_cuda_float32 SKIPPED [0.0002s] (Errors when storage_offset is included) [ 14%] 2025-12-04T13:28:26.5091304Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atan2_cuda_float32 PASSED [0.7715s] [ 14%] 2025-12-04T13:28:26.5091422Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atanh_cuda_complex64 PASSED [0.7445s] [ 14%] 2025-12-04T13:28:26.5091560Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_complex64 PASSED [0.7335s] [ 14%] 2025-12-04T13:28:26.5091681Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_1d_cuda_float32 PASSED [0.0045s] [ 14%] 2025-12-04T13:28:26.5091804Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_atleast_2d_cuda_complex64 PASSED [0.0062s] [ 14%] 2025-12-04T13:28:26.5091985Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_complex64 PASSED [0.7355s] [ 14%] 2025-12-04T13:28:26.5092117Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bfloat16_cuda_float32 PASSED [0.0065s] [ 14%] 2025-12-04T13:28:26.5092241Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_block_diag_cuda_complex64 PASSED [0.0104s] [ 14%] 2025-12-04T13:28:26.5092355Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bmm_cuda_float32 PASSED [3.3641s] [ 14%] 2025-12-04T13:28:26.5092491Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_tensors_cuda_complex64 PASSED [0.7117s] [ 14%] 2025-12-04T13:28:26.5092620Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_broadcast_to_cuda_complex64 PASSED [0.7363s] [ 15%] 2025-12-04T13:28:26.5092740Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_bucketize_cuda_float32 PASSED [0.0105s] [ 15%] 2025-12-04T13:28:26.5092860Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cdouble_cuda_complex64 PASSED [0.0096s] [ 15%] 2025-12-04T13:28:26.5092979Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cfloat_cuda_complex64 PASSED [0.7318s] [ 15%] 2025-12-04T13:28:26.5093098Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_char_cuda_complex64 PASSED [0.7228s] [ 15%] 2025-12-04T13:28:26.5093219Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_cuda_complex64 PASSED [0.8285s] [ 15%] 2025-12-04T13:28:26.5093355Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_inverse_cuda_complex64 PASSED [0.8052s] [ 15%] 2025-12-04T13:28:26.5093486Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cholesky_solve_cuda_complex64 PASSED [0.0264s] [ 15%] 2025-12-04T13:28:26.5093606Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_clamp_max_cuda_float32 PASSED [0.7253s] [ 15%] 2025-12-04T13:28:26.5093730Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_column_stack_cuda_float32 PASSED [0.0041s] [ 15%] 2025-12-04T13:28:26.5093859Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_combinations_cuda_complex64 PASSED [0.0320s] [ 15%] 2025-12-04T13:28:26.5093977Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_conj_cuda_complex64 PASSED [0.7361s] [ 15%] 2025-12-04T13:28:26.5094095Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_copysign_cuda_float32 PASSED [0.0187s] [ 15%] 2025-12-04T13:28:26.5094228Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_corrcoef_cuda_float32 PASSED [0.7268s] [ 15%] 2025-12-04T13:28:26.5094344Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cos_cuda_complex64 PASSED [0.7401s] [ 15%] 2025-12-04T13:28:26.5094460Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cosh_cuda_float32 PASSED [0.7444s] [ 15%] 2025-12-04T13:28:26.5094577Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cross_cuda_complex64 PASSED [0.7560s] [ 15%] 2025-12-04T13:28:26.5094695Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cummax_cuda_float32 PASSED [0.7393s] [ 15%] 2025-12-04T13:28:26.5094811Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumprod_cuda_float32 PASSED [0.7744s] [ 15%] 2025-12-04T13:28:26.5094929Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumsum_cuda_float32 PASSED [0.7167s] [ 15%] 2025-12-04T13:28:26.5095070Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_cumulative_trapezoid_cuda_complex64 PASSED [0.7348s] [ 15%] 2025-12-04T13:28:26.5095189Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_deg2rad_cuda_float32 PASSED [0.0050s] [ 15%] 2025-12-04T13:28:26.5095330Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagflat_cuda_complex64 PASSED [0.0106s] [ 15%] 2025-12-04T13:28:26.5095453Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diagonal_cuda_complex64 PASSED 
[0.0215s] [ 15%] 2025-12-04T13:28:26.5095567Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_diff_cuda_float32 PASSED [0.0499s] [ 15%] 2025-12-04T13:28:26.5095697Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_complex64 PASSED [0.8031s] [ 16%] 2025-12-04T13:28:26.5095822Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dist_cuda_float32 PASSED [0.0388s] [ 16%] 2025-12-04T13:28:26.5095955Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_div_floor_rounding_cuda_float32 PASSED [0.7476s] [ 16%] 2025-12-04T13:28:26.5096068Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_dot_cuda_float32 PASSED [0.0048s] [ 16%] 2025-12-04T13:28:26.5096192Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_complex64 PASSED [0.7321s] [ 16%] 2025-12-04T13:28:26.5096313Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_empty_like_cuda_float32 PASSED [0.0040s] [ 16%] 2025-12-04T13:28:26.5096429Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_erfinv_cuda_float32 PASSED [0.7214s] [ 16%] 2025-12-04T13:28:26.5096548Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_fft2_cuda_float32 PASSED [2.4162s] [ 16%] 2025-12-04T13:28:26.5096668Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_ihfftn_cuda_float32 PASSED [5.2449s] [ 16%] 2025-12-04T13:28:26.5096788Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfft_cuda_float32 PASSED [2.7822s] [ 16%] 2025-12-04T13:28:26.5096911Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fft_irfftn_cuda_complex64 PASSED [1.3662s] [ 16%] 2025-12-04T13:28:26.5097030Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_flatten_cuda_float32 PASSED [0.7861s] [ 16%] 2025-12-04T13:28:26.5097143Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_fmod_cuda_float32 PASSED [0.0122s] [ 16%] 2025-12-04T13:28:26.5097271Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_2d_cuda_float32 PASSED [0.8025s] [ 16%] 2025-12-04T13:28:26.5097415Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_grid_sampler_3d_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 16%] 2025-12-04T13:28:26.5097530Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_gt_cuda_float32 PASSED [0.7936s] [ 16%] 2025-12-04T13:28:26.5097647Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_hypot_cuda_float32 PASSED [0.0124s] [ 16%] 2025-12-04T13:28:26.5097760Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_i0_cuda_float32 PASSED [0.7764s] [ 16%] 2025-12-04T13:28:26.5097889Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_add_cuda_float32 PASSED [0.0132s] [ 16%] 2025-12-04T13:28:26.5098014Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64 PASSED [0.0257s] [ 16%] 2025-12-04T13:28:26.5098143Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_reduce_amin_cuda_float32 PASSED [0.8018s] [ 16%] 2025-12-04T13:28:26.5098261Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_istft_cuda_complex64 PASSED [1.7483s] [ 16%] 2025-12-04T13:28:26.5098375Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_item_cuda_float32 PASSED [0.8097s] [ 16%] 2025-12-04T13:28:26.5098519Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_2inputs_2outputs_cuda_float32 PASSED [0.0063s] [ 16%] 2025-12-04T13:28:26.5098651Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_jiterator_unary_cuda_complex64 PASSED [0.7977s] [ 17%] 2025-12-04T13:28:26.5098766Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ldexp_cuda_complex64 XFAIL [0.0123s] [ 17%] 2025-12-04T13:28:26.5098899Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cholesky_cuda_complex64 PASSED [1.6034s] [ 17%] 2025-12-04T13:28:26.5099032Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cond_cuda_float32 PASSED [0.8372s] [ 17%] 2025-12-04T13:28:26.5099159Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_cross_cuda_complex64 PASSED [0.7777s] [ 17%] 2025-12-04T13:28:26.5099294Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_eigvals_cuda_float32 PASSED [0.0976s] [ 17%] 2025-12-04T13:28:26.5099419Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_inv_cuda_complex64 PASSED [0.0247s] [ 17%] 2025-12-04T13:28:26.5099559Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_factor_cuda_float32 PASSED [1.2358s] [ 17%] 2025-12-04T13:28:26.5099785Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_complex64 SKIPPED [0.0011s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 17%] 2025-12-04T13:28:26.5100001Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_ldl_solve_cuda_float32 SKIPPED [0.0009s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 17%] 2025-12-04T13:28:26.5100129Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_lstsq_cuda_complex64 PASSED [1.6239s] [ 17%] 2025-12-04T13:28:26.5100263Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_multi_dot_cuda_complex64 PASSED [1.2345s] [ 17%] 2025-12-04T13:28:26.5100388Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_norm_cuda_complex64 PASSED [1.3066s] [ 17%] 2025-12-04T13:28:26.5100531Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_pinv_hermitian_cuda_complex64 PASSED [1.2927s] [ 17%] 2025-12-04T13:28:26.5100658Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_solve_ex_cuda_float32 PASSED [1.2899s] [ 17%] 2025-12-04T13:28:26.5100789Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_svdvals_cuda_complex64 PASSED [1.2553s] [ 17%] 2025-12-04T13:28:26.5100925Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_tensorsolve_cuda_complex64 PASSED [1.2429s] [ 17%] 2025-12-04T13:28:26.5101055Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vander_cuda_complex64 PASSED [1.2677s] [ 17%] 2025-12-04T13:28:26.5101181Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_linalg_vecdot_cuda_float32 PASSED [0.0136s] [ 17%] 2025-12-04T13:28:26.5101298Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log10_cuda_float32 PASSED [1.2299s] [ 17%] 2025-12-04T13:28:26.5101415Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log2_cuda_complex64 PASSED [1.2486s] [ 17%] 2025-12-04T13:28:26.5101536Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_normal_cuda_float32 XFAIL [0.0102s] [ 17%] 2025-12-04T13:28:26.5101669Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_cuda_float32 PASSED [2.4832s] [ 17%] 2025-12-04T13:28:26.5101809Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_log_softmax_with_dtype_cuda_float32 PASSED [0.0128s] [ 17%] 2025-12-04T13:28:26.5101971Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logdet_cuda_float32 PASSED [1.2468s] [ 17%] 2025-12-04T13:28:26.5102093Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_and_cuda_float32 PASSED [0.0052s] [ 18%] 2025-12-04T13:28:26.5102237Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_not_cuda_complex64 SKIPPED [0.0002s] (Skipped!) 
[ 18%] 2025-12-04T13:28:26.5102357Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_or_cuda_float32 PASSED [1.2589s] [ 18%] 2025-12-04T13:28:26.5102483Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_complex64 PASSED [1.2421s] [ 18%] 2025-12-04T13:28:26.5102605Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logical_xor_cuda_float32 PASSED [1.2499s] [ 18%] 2025-12-04T13:28:26.5102729Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_complex64 PASSED [1.2931s] [ 18%] 2025-12-04T13:28:26.5102863Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_logsumexp_cuda_float32 PASSED [1.2613s] [ 18%] 2025-12-04T13:28:26.5102977Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_lu_cuda_complex64 PASSED [1.3203s] [ 18%] 2025-12-04T13:28:26.5103090Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mT_cuda_complex64 PASSED [1.2642s] [ 18%] 2025-12-04T13:28:26.5103231Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_complex64 PASSED [1.2509s] [ 18%] 2025-12-04T13:28:26.5103368Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_cumsum_cuda_float32 PASSED [0.0099s] [ 18%] 2025-12-04T13:28:26.5103493Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_complex64 PASSED [0.0245s] [ 18%] 2025-12-04T13:28:26.5103615Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_fill_cuda_float32 PASSED [1.2384s] [ 18%] 2025-12-04T13:28:26.5103739Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_mean_cuda_complex64 PASSED [0.0651s] [ 18%] 2025-12-04T13:28:26.5103870Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_normalize_cuda_float32 PASSED [0.0133s] [ 18%] 2025-12-04T13:28:26.5103992Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_masked_prod_cuda_float32 PASSED [0.0321s] [ 18%] 2025-12-04T13:28:26.5104110Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matmul_cuda_float32 PASSED [0.0261s] [ 18%] 2025-12-04T13:28:26.5104232Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_matrix_exp_cuda_complex64 PASSED [1.2973s] [ 18%] 2025-12-04T13:28:26.5104384Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_max_pool2d_with_indices_backward_cuda_float32 PASSED [0.3551s] [ 18%] 2025-12-04T13:28:26.5104500Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mean_cuda_complex64 PASSED [0.0303s] [ 18%] 2025-12-04T13:28:26.5104622Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_movedim_cuda_complex64 PASSED [1.2517s] [ 18%] 2025-12-04T13:28:26.5104736Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_complex64 PASSED [1.2724s] [ 18%] 2025-12-04T13:28:26.5104848Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mv_cuda_float32 PASSED [0.0051s] [ 18%] 2025-12-04T13:28:26.5104987Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_mvlgamma_mvlgamma_p_5_cuda_float32 PASSED [1.2605s] [ 18%] 2025-12-04T13:28:26.5105107Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nanmedian_cuda_float32 PASSED [0.0210s] [ 18%] 2025-12-04T13:28:26.5105227Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nansum_cuda_complex64 PASSED [1.3022s] [ 19%] 2025-12-04T13:28:26.5105352Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_copy_cuda_complex64 PASSED [1.2442s] [ 19%] 2025-12-04T13:28:26.5105484Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_narrow_cuda_complex64 PASSED [1.2605s] [ 19%] 2025-12-04T13:28:26.5105616Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_native_batch_norm_cuda_float32 PASSED [0.0233s] [ 19%] 2025-12-04T13:28:26.5105812Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_empty_strided_cuda_complex64 SKIPPED [0.0002s] (Expected: new_empty_strided is not comparable) [ 19%] 2025-12-04T13:28:26.5105932Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_new_zeros_cuda_float32 PASSED [1.2799s] [ 19%] 2025-12-04T13:28:26.5106087Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_avg_pool3d_cuda_float32 PASSED [0.0127s] [ 19%] 2025-12-04T13:28:26.5106240Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [1.2222s] [ 19%] 2025-12-04T13:28:26.5106380Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_complex64 PASSED [1.3750s] [ 19%] 2025-12-04T13:28:26.5106518Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv1d_cuda_float32 PASSED [1.2274s] [ 19%] 2025-12-04T13:28:26.5106677Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose1d_cuda_float32 PASSED [0.0163s] [ 19%] 2025-12-04T13:28:26.5106828Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose2d_cuda_complex64 PASSED [0.0576s] [ 19%] 2025-12-04T13:28:26.5106989Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_conv_transpose3d_cuda_complex64 PASSED [0.2784s] [ 19%] 2025-12-04T13:28:26.5107138Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_ctc_loss_cuda_float32 PASSED [1.2455s] [ 19%] 2025-12-04T13:28:26.5107293Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_fractional_max_pool3d_cuda_float32 PASSED [0.0284s] [ 19%] 2025-12-04T13:28:26.5107427Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_gelu_cuda_float32 PASSED [1.2633s] [ 19%] 2025-12-04T13:28:26.5107567Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_group_norm_cuda_float32 PASSED [0.0425s] [ 19%] 2025-12-04T13:28:26.5107708Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_hardshrink_cuda_float32 PASSED [1.2643s] [ 19%] 2025-12-04T13:28:26.5107858Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_linear_cuda_float32 PASSED [0.0117s] [ 19%] 2025-12-04T13:28:26.5108010Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_nearest_cuda_float32 PASSED [1.2234s] [ 19%] 2025-12-04T13:28:26.5108166Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_interpolate_trilinear_cuda_float32 PASSED [1.2460s] [ 19%] 2025-12-04T13:28:26.5108307Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_logsigmoid_cuda_float32 PASSED [1.2395s] [ 19%] 2025-12-04T13:28:26.5108460Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_margin_ranking_loss_cuda_float32 PASSED [1.2444s] [ 19%] 2025-12-04T13:28:26.5108598Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_max_pool1d_cuda_float32 PASSED [0.1560s] [ 19%] 2025-12-04T13:28:26.5108744Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_pixel_unshuffle_cuda_float32 PASSED [1.2192s] [ 19%] 2025-12-04T13:28:26.5108876Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_relu_cuda_float32 PASSED [0.0046s] [ 20%] 2025-12-04T13:28:26.5109010Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_rrelu_cuda_float32 PASSED [1.2264s] [ 20%] 2025-12-04T13:28:26.5109155Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_silu_complex_cuda_complex64 PASSED [1.2402s] [ 20%] 2025-12-04T13:28:26.5109309Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softmin_cuda_float32 PASSED [1.2271s] [ 20%] 2025-12-04T13:28:26.5109447Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softshrink_cuda_float32 PASSED [0.0047s] [ 20%] 2025-12-04T13:28:26.5109586Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_softsign_cuda_float32 PASSED [1.2083s] [ 20%] 2025-12-04T13:28:26.5109758Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_triplet_margin_with_distance_loss_cuda_complex64 PASSED [1.2381s] [ 20%] 2025-12-04T13:28:26.5109908Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_bilinear_cuda_float32 PASSED [0.0103s] [ 20%] 2025-12-04T13:28:26.5110058Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nn_functional_upsample_nearest_cuda_float32 PASSED [1.2407s] [ 20%] 2025-12-04T13:28:26.5110209Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_nonzero_static_cuda_float32 SKIPPED [0.0012s] (Only runs on cpu) [ 20%] 2025-12-04T13:28:26.5110329Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_complex64 PASSED [1.2719s] [ 20%] 2025-12-04T13:28:26.5110443Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_cuda_float32 PASSED [0.0289s] [ 20%] 2025-12-04T13:28:26.5110574Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_inf_cuda_float32 PASSED [1.2230s] [ 20%] 2025-12-04T13:28:26.5110696Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_norm_nuc_cuda_complex64 PASSED [1.2581s] [ 20%] 2025-12-04T13:28:26.5110822Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ones_cuda_complex64 XFAIL [0.0046s] [ 20%] 2025-12-04T13:28:26.5110962Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_0_cuda_float32 PASSED [1.2579s] [ 20%] 2025-12-04T13:28:26.5111127Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_polygamma_polygamma_n_4_cuda_float32 SKIPPED [0.0003s] (Skipped!) 
[ 20%] 2025-12-04T13:28:26.5111242Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_prod_cuda_float32 PASSED [1.2518s] [ 20%] 2025-12-04T13:28:26.5111356Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_qr_cuda_complex64 PASSED [1.2864s] [ 20%] 2025-12-04T13:28:26.5111473Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randint_cuda_float32 XFAIL [0.0051s] [ 20%] 2025-12-04T13:28:26.5111588Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_randn_cuda_float32 XFAIL [0.0027s] [ 20%] 2025-12-04T13:28:26.5111703Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_ravel_cuda_float32 PASSED [1.2460s] [ 20%] 2025-12-04T13:28:26.5111830Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_reciprocal_cuda_complex64 PASSED [1.2422s] [ 20%] 2025-12-04T13:28:26.5111990Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_renorm_cuda_float32 PASSED [0.0098s] [ 20%] 2025-12-04T13:28:26.5112106Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_repeat_cuda_float32 PASSED [1.2340s] [ 20%] 2025-12-04T13:28:26.5112226Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_resize__cuda_float32 PASSED [0.0044s] [ 21%] 2025-12-04T13:28:26.5112341Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_roll_cuda_complex64 PASSED [0.0248s] [ 21%] 2025-12-04T13:28:26.5112457Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_cuda_float32 PASSED [1.2390s] [ 21%] 2025-12-04T13:28:26.5112606Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_round_decimals_neg_3_cuda_float32 SKIPPED [0.0002s] (Skipped!) 
[ 21%] 2025-12-04T13:28:26.5112726Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_rsqrt_cuda_complex64 PASSED [1.2324s] [ 21%] 2025-12-04T13:28:26.5112842Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_short_cuda_float32 PASSED [0.0042s] [ 21%] 2025-12-04T13:28:26.5112995Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_hamming_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 21%] 2025-12-04T13:28:26.5113160Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_signal_windows_nuttall_cuda_float32 SKIPPED [0.0001s] (Skipped!) [ 21%] 2025-12-04T13:28:26.5113276Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sin_cuda_complex64 PASSED [1.2096s] [ 21%] 2025-12-04T13:28:26.5113393Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_slice_cuda_complex64 PASSED [1.2288s] [ 21%] 2025-12-04T13:28:26.5113530Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_complex64 PASSED [1.2379s] [ 21%] 2025-12-04T13:28:26.5113665Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_softmax_with_dtype_cuda_float32 PASSED [0.0129s] [ 21%] 2025-12-04T13:28:26.5113797Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_j1_cuda_float32 PASSED [1.1976s] [ 21%] 2025-12-04T13:28:26.5113930Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_bessel_y1_cuda_float32 PASSED [1.2125s] [ 21%] 2025-12-04T13:28:26.5114082Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_chebyshev_polynomial_u_cuda_float32 PASSED [1.2101s] [ 21%] 2025-12-04T13:28:26.5114206Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_i0e_cuda_float32 PASSED [1.1948s] [ 21%] 2025-12-04T13:28:26.5114362Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_i1_cuda_float32 PASSED [1.2262s] [ 21%] 2025-12-04T13:28:26.5114506Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_modified_bessel_k0_cuda_float32 PASSED [1.2464s] [ 21%] 2025-12-04T13:28:26.5114646Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_special_xlog1py_cuda_float32 PASSED [1.2387s] [ 21%] 2025-12-04T13:28:26.5114760Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_cuda_float32 PASSED [0.0043s] [ 21%] 2025-12-04T13:28:26.5114903Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_list_args_cuda_complex64 PASSED [0.0065s] [ 21%] 2025-12-04T13:28:26.5115040Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_copy_cuda_float32 PASSED [1.2257s] [ 21%] 2025-12-04T13:28:26.5115172Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_split_with_sizes_cuda_complex64 PASSED [1.2214s] [ 21%] 2025-12-04T13:28:26.5115299Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_copy_cuda_float32 PASSED [0.0056s] [ 21%] 2025-12-04T13:28:26.5115419Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_squeeze_cuda_complex64 PASSED [0.0229s] [ 21%] 2025-12-04T13:28:26.5115535Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_stack_cuda_float32 PASSED [1.2471s] [ 22%] 2025-12-04T13:28:26.5115670Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_std_mean_unbiased_cuda_complex64 PASSED [1.2114s] [ 22%] 2025-12-04T13:28:26.5115783Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_sub_cuda_float32 PASSED [0.0295s] [ 22%] 2025-12-04T13:28:26.5115896Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_svd_cuda_float32 PASSED [0.1560s] [ 22%] 2025-12-04T13:28:26.5116016Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_t_copy_cuda_complex64 PASSED [1.2200s] [ 22%] 2025-12-04T13:28:26.5116133Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_take_cuda_complex64 PASSED [1.2583s] [ 22%] 2025-12-04T13:28:26.5116253Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_tensordot_cuda_float32 PASSED [0.0059s] [ 22%] 2025-12-04T13:28:26.5116367Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_to_cuda_complex64 PASSED [1.2765s] [ 22%] 2025-12-04T13:28:26.5116521Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_torch_ops_aten__safe_softmax_default_cuda_float32 PASSED [0.0060s] [ 22%] 2025-12-04T13:28:26.5116637Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_trace_cuda_float32 PASSED [1.2011s] [ 22%] 2025-12-04T13:28:26.5116765Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_copy_cuda_complex64 PASSED [1.2098s] [ 22%] 2025-12-04T13:28:26.5116899Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_complex64 PASSED [1.2934s] [ 22%] 2025-12-04T13:28:26.5117021Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_transpose_cuda_float32 PASSED [0.0266s] [ 22%] 2025-12-04T13:28:26.5117155Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_triangular_solve_cuda_complex64 PASSED [1.2717s] [ 22%] 2025-12-04T13:28:26.5117280Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_complex64 PASSED [1.2385s] [ 22%] 2025-12-04T13:28:26.5117402Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_true_divide_cuda_float32 PASSED [0.0128s] [ 22%] 2025-12-04T13:28:26.5117527Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unbind_copy_cuda_complex64 PASSED [0.0056s] [ 22%] 2025-12-04T13:28:26.5117652Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unfold_copy_cuda_complex64 PASSED [0.0148s] [ 22%] 2025-12-04T13:28:26.5117779Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_chunk_cuda_complex64 PASSED [1.2226s] [ 22%] 2025-12-04T13:28:26.5117905Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_unsafe_split_cuda_complex64 PASSED [1.2515s] [ 22%] 2025-12-04T13:28:26.5118051Z 
test_ops.py::TestCommonCUDA::test_variant_consistency_eager_var_mean_unbiased_cuda_complex64 PASSED [1.2122s] [ 22%] 2025-12-04T13:28:26.5118170Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_vstack_cuda_complex64 PASSED [1.2350s] [ 22%] 2025-12-04T13:28:26.5118293Z test_ops.py::TestCommonCUDA::test_variant_consistency_eager_zeros_like_cuda_float32 PASSED [0.0043s] [ 22%] 2025-12-04T13:28:26.5118427Z test_ops.py::TestCompositeComplianceCUDA::test_backward___getitem___cuda_float32 PASSED [0.0477s] [ 22%] 2025-12-04T13:28:26.5118563Z test_ops.py::TestCompositeComplianceCUDA::test_backward___rmatmul___cuda_float32 PASSED [0.1027s] [ 22%] 2025-12-04T13:28:26.5118702Z test_ops.py::TestCompositeComplianceCUDA::test_backward__segment_reduce_offsets_cuda_float32 PASSED [0.4245s] [ 23%] 2025-12-04T13:28:26.5118823Z test_ops.py::TestCompositeComplianceCUDA::test_backward_addcdiv_cuda_float32 PASSED [0.2172s] [ 23%] 2025-12-04T13:28:26.5118942Z test_ops.py::TestCompositeComplianceCUDA::test_backward_addmv_cuda_float32 PASSED [0.0849s] [ 23%] 2025-12-04T13:28:26.5119060Z test_ops.py::TestCompositeComplianceCUDA::test_backward_amax_cuda_float32 PASSED [0.0555s] [ 23%] 2025-12-04T13:28:26.5119176Z test_ops.py::TestCompositeComplianceCUDA::test_backward_amin_cuda_float32 PASSED [0.0558s] [ 23%] 2025-12-04T13:28:26.5119293Z test_ops.py::TestCompositeComplianceCUDA::test_backward_angle_cuda_float32 PASSED [0.0046s] [ 23%] 2025-12-04T13:28:26.5119409Z test_ops.py::TestCompositeComplianceCUDA::test_backward_atan2_cuda_float32 PASSED [0.0714s] [ 23%] 2025-12-04T13:28:26.5119533Z test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_2d_cuda_float32 PASSED [0.0587s] [ 23%] 2025-12-04T13:28:26.5119655Z test_ops.py::TestCompositeComplianceCUDA::test_backward_atleast_3d_cuda_float32 PASSED [0.0635s] [ 23%] 2025-12-04T13:28:26.5119774Z test_ops.py::TestCompositeComplianceCUDA::test_backward_bfloat16_cuda_float32 PASSED [0.0112s] [ 23%] 2025-12-04T13:28:26.5119889Z 
test_ops.py::TestCompositeComplianceCUDA::test_backward_cat_cuda_float32 PASSED [0.0204s] [ 23%] 2025-12-04T13:28:26.5120003Z test_ops.py::TestCompositeComplianceCUDA::test_backward_cdist_cuda_float32 PASSED [1.7261s] [ 23%] 2025-12-04T13:28:26.5120135Z test_ops.py::TestCompositeComplianceCUDA::test_backward_cholesky_inverse_cuda_float32 PASSED [1.3208s] [ 23%] 2025-12-04T13:28:26.5120262Z test_ops.py::TestCompositeComplianceCUDA::test_backward_column_stack_cuda_float32 PASSED [0.0129s] [ 23%] 2025-12-04T13:28:26.5120388Z test_ops.py::TestCompositeComplianceCUDA::test_backward_combinations_cuda_float32 PASSED [0.0955s] [ 23%] 2025-12-04T13:28:26.5120514Z test_ops.py::TestCompositeComplianceCUDA::test_backward_conj_physical_cuda_float32 PASSED [1.2217s] [ 23%] 2025-12-04T13:28:26.5120632Z test_ops.py::TestCompositeComplianceCUDA::test_backward_cummax_cuda_float32 PASSED [0.0118s] [ 23%] 2025-12-04T13:28:26.5120761Z test_ops.py::TestCompositeComplianceCUDA::test_backward_cummin_cuda_float32 PASSED [0.0084s] [ 23%] 2025-12-04T13:28:26.5120880Z test_ops.py::TestCompositeComplianceCUDA::test_backward_cumprod_cuda_float32 PASSED [0.0591s] [ 23%] 2025-12-04T13:28:26.5121008Z test_ops.py::TestCompositeComplianceCUDA::test_backward_diagonal_copy_cuda_float32 PASSED [0.0301s] [ 23%] 2025-12-04T13:28:26.5121140Z test_ops.py::TestCompositeComplianceCUDA::test_backward_div_trunc_rounding_cuda_float32 PASSED [0.0474s] [ 23%] 2025-12-04T13:28:26.5121255Z test_ops.py::TestCompositeComplianceCUDA::test_backward_dot_cuda_float32 PASSED [1.2579s] [ 23%] 2025-12-04T13:28:26.5121372Z test_ops.py::TestCompositeComplianceCUDA::test_backward_double_cuda_float32 PASSED [0.0138s] [ 23%] 2025-12-04T13:28:26.5121487Z test_ops.py::TestCompositeComplianceCUDA::test_backward_erf_cuda_float32 PASSED [0.0052s] [ 23%] 2025-12-04T13:28:26.5121601Z test_ops.py::TestCompositeComplianceCUDA::test_backward_exp_cuda_float32 PASSED [0.0086s] [ 23%] 2025-12-04T13:28:26.5121726Z 
test_ops.py::TestCompositeComplianceCUDA::test_backward_expand_copy_cuda_float32 PASSED [0.0184s] [ 24%] 2025-12-04T13:28:26.5121896Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_fftshift_cuda_float32 PASSED [1.2427s] [ 24%] 2025-12-04T13:28:26.5122017Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_hfftn_cuda_float32 PASSED [0.0517s] [ 24%] 2025-12-04T13:28:26.5122137Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_ihfft_cuda_float32 PASSED [0.0274s] [ 24%] 2025-12-04T13:28:26.5122273Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fft_irfft2_cuda_float32 PASSED [0.3226s] [ 24%] 2025-12-04T13:28:26.5122408Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fill_cuda_float32 PASSED [0.0071s] [ 24%] 2025-12-04T13:28:26.5122522Z test_ops.py::TestCompositeComplianceCUDA::test_backward_fmin_cuda_float32 PASSED [0.0687s] [ 24%] 2025-12-04T13:28:26.5122669Z test_ops.py::TestCompositeComplianceCUDA::test_backward_grid_sampler_3d_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 24%] 2025-12-04T13:28:26.5122786Z test_ops.py::TestCompositeComplianceCUDA::test_backward_hstack_cuda_float32 PASSED [0.0081s] [ 24%] 2025-12-04T13:28:26.5122917Z test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_cholesky_cuda_float32 PASSED [0.3881s] [ 24%] 2025-12-04T13:28:26.5123038Z test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eig_cuda_float32 PASSED [0.2290s] [ 24%] 2025-12-04T13:28:26.5123166Z test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_eigvals_cuda_float32 PASSED [0.1226s] [ 24%] 2025-12-04T13:28:26.5123306Z test_ops.py::TestCompositeComplianceCUDA::test_backward_linalg_solve_triangular_cuda_float32 PASSED [2.3483s] [ 24%] 2025-12-04T13:28:26.5123421Z test_ops.py::TestCompositeComplianceCUDA::test_backward_log_cuda_float32 PASSED [0.0079s] [ 24%] 2025-12-04T13:28:26.5123553Z test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_logsumexp_cuda_float32 PASSED [0.5108s] [ 24%] 2025-12-04T13:28:26.5123683Z test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_scatter_cuda_float32 PASSED [0.8378s] [ 24%] 2025-12-04T13:28:26.5123809Z test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_select_cuda_float32 PASSED [0.0384s] [ 24%] 2025-12-04T13:28:26.5123936Z test_ops.py::TestCompositeComplianceCUDA::test_backward_masked_softmax_cuda_float32 PASSED [0.1178s] [ 24%] 2025-12-04T13:28:26.5124053Z test_ops.py::TestCompositeComplianceCUDA::test_backward_matmul_cuda_float32 PASSED [0.0988s] [ 24%] 2025-12-04T13:28:26.5124193Z test_ops.py::TestCompositeComplianceCUDA::test_backward_max_reduction_with_dim_cuda_float32 PASSED [0.0104s] [ 24%] 2025-12-04T13:28:26.5124312Z test_ops.py::TestCompositeComplianceCUDA::test_backward_maximum_cuda_float32 PASSED [0.0662s] [ 24%] 2025-12-04T13:28:26.5124427Z test_ops.py::TestCompositeComplianceCUDA::test_backward_mean_cuda_float32 PASSED [0.8401s] [ 24%] 2025-12-04T13:28:26.5124579Z 
test_ops.py::TestCompositeComplianceCUDA::test_backward_mvlgamma_mvlgamma_p_1_cuda_float32 PASSED [0.0287s] [ 24%] 2025-12-04T13:28:26.5124695Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nansum_cuda_float32 PASSED [0.0770s] [ 24%] 2025-12-04T13:28:26.5124809Z test_ops.py::TestCompositeComplianceCUDA::test_backward_neg_cuda_float32 PASSED [0.7892s] [ 24%] 2025-12-04T13:28:26.5124964Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_avg_pool3d_cuda_float32 PASSED [0.0270s] [ 25%] 2025-12-04T13:28:26.5125121Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0390s] [ 25%] 2025-12-04T13:28:26.5125263Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_avg_pool1d_cuda_float32 PASSED [0.0298s] [ 25%] 2025-12-04T13:28:26.5125424Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_batch_norm_without_cudnn_cuda_float32 PASSED [0.4479s] [ 25%] 2025-12-04T13:28:26.5125576Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_conv_transpose1d_cuda_float32 PASSED [0.1221s] [ 25%] 2025-12-04T13:28:26.5125732Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cosine_embedding_loss_cuda_float32 PASSED [0.2297s] [ 25%] 2025-12-04T13:28:26.5125891Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_cross_entropy_cuda_float32 PASSED [0.1626s] [ 25%] 2025-12-04T13:28:26.5126031Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_ctc_loss_cuda_float32 PASSED [0.3119s] [ 25%] 2025-12-04T13:28:26.5126178Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_l1_loss_cuda_float32 PASSED [0.0377s] [ 25%] 2025-12-04T13:28:26.5126323Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_linear_cuda_float32 PASSED [0.1866s] [ 25%] 2025-12-04T13:28:26.5126568Z 
test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multi_head_attention_forward_cuda_float32 SKIPPED [0.0003s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 25%] 2025-12-04T13:28:26.5126733Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_multilabel_soft_margin_loss_cuda_float32 PASSED [0.0442s] [ 25%] 2025-12-04T13:28:26.5126875Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_normalize_cuda_float32 PASSED [0.0279s] [ 25%] 2025-12-04T13:28:26.5127008Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_prelu_cuda_float32 PASSED [0.0978s] [ 25%] 2025-12-04T13:28:26.5127142Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_relu6_cuda_float32 PASSED [0.0120s] [ 25%] 2025-12-04T13:28:26.5127284Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softshrink_cuda_float32 PASSED [0.0080s] [ 25%] 2025-12-04T13:28:26.5127424Z test_ops.py::TestCompositeComplianceCUDA::test_backward_nn_functional_softsign_cuda_float32 PASSED [0.0079s] [ 25%] 2025-12-04T13:28:26.5127547Z test_ops.py::TestCompositeComplianceCUDA::test_backward_reciprocal_cuda_float32 PASSED [0.0062s] [ 25%] 2025-12-04T13:28:26.5127667Z test_ops.py::TestCompositeComplianceCUDA::test_backward_reshape_cuda_float32 PASSED [0.0103s] [ 25%] 2025-12-04T13:28:26.5127783Z test_ops.py::TestCompositeComplianceCUDA::test_backward_roll_cuda_float32 PASSED [0.0198s] [ 25%] 2025-12-04T13:28:26.5127913Z test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_0_cuda_float32 PASSED [0.0053s] [ 25%] 2025-12-04T13:28:26.5128048Z test_ops.py::TestCompositeComplianceCUDA::test_backward_round_decimals_neg_3_cuda_float32 PASSED [0.0052s] [ 25%] 2025-12-04T13:28:26.5128182Z test_ops.py::TestCompositeComplianceCUDA::test_backward_scatter_reduce_amax_cuda_float32 PASSED [0.3108s] [ 25%] 2025-12-04T13:28:26.5128298Z 
test_ops.py::TestCompositeComplianceCUDA::test_backward_sinh_cuda_float32 PASSED [0.0040s] [ 25%] 2025-12-04T13:28:26.5128424Z test_ops.py::TestCompositeComplianceCUDA::test_backward_slice_scatter_cuda_float32 PASSED [0.0361s] [ 25%] 2025-12-04T13:28:26.5128562Z test_ops.py::TestCompositeComplianceCUDA::test_backward_special_entr_cuda_float32 PASSED [0.0064s] [ 26%] 2025-12-04T13:28:26.5128692Z test_ops.py::TestCompositeComplianceCUDA::test_backward_special_log_ndtr_cuda_float32 PASSED [0.0081s] [ 26%] 2025-12-04T13:28:26.5128857Z test_ops.py::TestCompositeComplianceCUDA::test_backward_special_polygamma_special_polygamma_n_0_cuda_float32 PASSED [0.0149s] [ 26%] 2025-12-04T13:28:26.5128992Z test_ops.py::TestCompositeComplianceCUDA::test_backward_split_with_sizes_copy_cuda_float32 PASSED [0.0182s] [ 26%] 2025-12-04T13:28:26.5129113Z test_ops.py::TestCompositeComplianceCUDA::test_backward_std_mean_cuda_float32 PASSED [0.0671s] [ 26%] 2025-12-04T13:28:26.5129237Z test_ops.py::TestCompositeComplianceCUDA::test_backward_std_unbiased_cuda_float32 PASSED [0.0064s] [ 26%] 2025-12-04T13:28:26.5129352Z test_ops.py::TestCompositeComplianceCUDA::test_backward_stft_cuda_float32 PASSED [0.7045s] [ 26%] 2025-12-04T13:28:26.5129480Z test_ops.py::TestCompositeComplianceCUDA::test_backward_take_along_dim_cuda_float32 PASSED [0.0190s] [ 26%] 2025-12-04T13:28:26.5129596Z test_ops.py::TestCompositeComplianceCUDA::test_backward_tan_cuda_float32 PASSED [0.0037s] [ 26%] 2025-12-04T13:28:26.5129728Z test_ops.py::TestCompositeComplianceCUDA::test_backward_tensordot_cuda_float32 PASSED [0.0241s] [ 26%] 2025-12-04T13:28:26.5129848Z test_ops.py::TestCompositeComplianceCUDA::test_backward_trapezoid_cuda_float32 PASSED [0.0567s] [ 26%] 2025-12-04T13:28:26.5129964Z test_ops.py::TestCompositeComplianceCUDA::test_backward_unbind_cuda_float32 PASSED [0.0906s] [ 26%] 2025-12-04T13:28:26.5130094Z test_ops.py::TestCompositeComplianceCUDA::test_backward_unflatten_cuda_float32 PASSED [0.0135s] [ 26%] 
2025-12-04T13:28:26.5130230Z test_ops.py::TestCompositeComplianceCUDA::test_backward_unfold_copy_cuda_float32 PASSED [0.0312s] [ 26%] 2025-12-04T13:28:26.5130360Z test_ops.py::TestCompositeComplianceCUDA::test_backward_var_mean_unbiased_cuda_float32 PASSED [0.0093s] [ 26%] 2025-12-04T13:28:26.5130482Z test_ops.py::TestCompositeComplianceCUDA::test_backward_view_copy_cuda_float32 PASSED [0.0107s] [ 26%] 2025-12-04T13:28:26.5130599Z test_ops.py::TestCompositeComplianceCUDA::test_backward_xlogy_cuda_float32 PASSED [0.0487s] [ 26%] 2025-12-04T13:28:26.5130716Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input___rmod___cuda_float32 PASSED [0.0062s] [ 26%] 2025-12-04T13:28:26.5130837Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input__chunk_cat_cuda_float32 PASSED [0.7985s] [ 26%] 2025-12-04T13:28:26.5130953Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_addmv_cuda_float32 PASSED [0.0080s] [ 26%] 2025-12-04T13:28:26.5131077Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_alias_copy_cuda_float32 PASSED [0.7862s] [ 26%] 2025-12-04T13:28:26.5131209Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_as_strided_scatter_cuda_float32 PASSED [0.0072s] [ 26%] 2025-12-04T13:28:26.5131323Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_asin_cuda_float32 PASSED [0.7908s] [ 26%] 2025-12-04T13:28:26.5131446Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_bernoulli_cuda_float32 PASSED [0.0182s] [ 26%] 2025-12-04T13:28:26.5131561Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_chunk_cuda_float32 PASSED [0.7867s] [ 26%] 2025-12-04T13:28:26.5131685Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_contiguous_cuda_float32 PASSED [0.0043s] [ 27%] 2025-12-04T13:28:26.5131804Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_corrcoef_cuda_float32 PASSED [0.8035s] [ 27%] 2025-12-04T13:28:26.5131950Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cummin_cuda_float32 PASSED [0.0051s] [ 27%] 
2025-12-04T13:28:26.5132087Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_cumulative_trapezoid_cuda_float32 PASSED [0.8124s] [ 27%] 2025-12-04T13:28:26.5132216Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_copy_cuda_float32 PASSED [0.0096s] [ 27%] 2025-12-04T13:28:26.5132347Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_diagonal_scatter_cuda_float32 PASSED [0.0096s] [ 27%] 2025-12-04T13:28:26.5132474Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dot_cuda_float32 PASSED [0.7865s] [ 27%] 2025-12-04T13:28:26.5132592Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_dstack_cuda_float32 PASSED [0.0046s] [ 27%] 2025-12-04T13:28:26.5132705Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_expm1_cuda_float32 PASSED [0.7830s] [ 27%] 2025-12-04T13:28:26.5132830Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_fftshift_cuda_float32 PASSED [0.0057s] [ 27%] 2025-12-04T13:28:26.5132948Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_ifft_cuda_float32 PASSED [0.8039s] [ 27%] 2025-12-04T13:28:26.5133070Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_fft_rfft2_cuda_float32 PASSED [0.9192s] [ 27%] 2025-12-04T13:28:26.5133189Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_full_like_cuda_float32 PASSED [0.7994s] [ 27%] 2025-12-04T13:28:26.5133307Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_gradient_cuda_float32 PASSED [0.0132s] [ 27%] 2025-12-04T13:28:26.5133425Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_igammac_cuda_float32 PASSED [0.0116s] [ 27%] 2025-12-04T13:28:26.5133571Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_index_reduce_amax_cuda_float32 PASSED [0.8028s] [ 27%] 2025-12-04T13:28:26.5133687Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isclose_cuda_float32 PASSED [0.0096s] [ 27%] 2025-12-04T13:28:26.5133802Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isin_cuda_float32 PASSED [0.7964s] [ 
27%] 2025-12-04T13:28:26.5133934Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_isneginf_cuda_float32 PASSED [0.0036s] [ 27%] 2025-12-04T13:28:26.5134077Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_jiterator_binary_cuda_float32 PASSED [0.0073s] [ 27%] 2025-12-04T13:28:26.5134201Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_cond_cuda_float32 PASSED [0.8082s] [ 27%] 2025-12-04T13:28:26.5134325Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_det_cuda_float32 PASSED [0.0106s] [ 27%] 2025-12-04T13:28:26.5134453Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_diagonal_cuda_float32 PASSED [0.0075s] [ 27%] 2025-12-04T13:28:26.5134582Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_eigvals_cuda_float32 PASSED [0.0438s] [ 27%] 2025-12-04T13:28:26.5134798Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_ldl_solve_cuda_float32 SKIPPED [0.0011s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 27%] 2025-12-04T13:28:26.5134924Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lstsq_cuda_float32 PASSED [0.9078s] [ 28%] 2025-12-04T13:28:26.5135054Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_lu_solve_cuda_float32 PASSED [0.1981s] [ 28%] 2025-12-04T13:28:26.5135193Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_pinv_hermitian_cuda_float32 PASSED [0.0325s] [ 28%] 2025-12-04T13:28:26.5135319Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_solve_cuda_float32 PASSED [0.0187s] [ 28%] 2025-12-04T13:28:26.5135446Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vander_cuda_float32 PASSED [0.7874s] [ 28%] 2025-12-04T13:28:26.5135580Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_linalg_vector_norm_cuda_float32 PASSED [0.0725s] [ 28%] 2025-12-04T13:28:26.5135696Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_log_cuda_float32 PASSED [0.7734s] [ 28%] 
2025-12-04T13:28:26.5135815Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_logspace_cuda_float32 PASSED [0.0495s] [ 28%] 2025-12-04T13:28:26.5135928Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_lu_cuda_float32 PASSED [0.0387s] [ 28%] 2025-12-04T13:28:26.5136054Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_amin_cuda_float32 PASSED [0.0384s] [ 28%] 2025-12-04T13:28:26.5136179Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmax_cuda_float32 PASSED [0.0163s] [ 28%] 2025-12-04T13:28:26.5136315Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_argmin_cuda_float32 PASSED [0.0165s] [ 28%] 2025-12-04T13:28:26.5136446Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_logsumexp_cuda_float32 PASSED [0.8173s] [ 28%] 2025-12-04T13:28:26.5136576Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_normalize_cuda_float32 PASSED [0.0207s] [ 28%] 2025-12-04T13:28:26.5136702Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_masked_scatter_cuda_float32 PASSED [0.7599s] [ 28%] 2025-12-04T13:28:26.5136837Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_max_reduction_no_dim_cuda_float32 PASSED [0.0045s] [ 28%] 2025-12-04T13:28:26.5136956Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_movedim_cuda_float32 PASSED [0.7552s] [ 28%] 2025-12-04T13:28:26.5137095Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_mvlgamma_mvlgamma_p_3_cuda_float32 PASSED [0.0080s] [ 28%] 2025-12-04T13:28:26.5137212Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmean_cuda_float32 PASSED [0.0168s] [ 28%] 2025-12-04T13:28:26.5137333Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nanmedian_cuda_float32 PASSED [0.0074s] [ 28%] 2025-12-04T13:28:26.5137475Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_native_batch_norm_cuda_float32 PASSED [0.7753s] [ 28%] 2025-12-04T13:28:26.5137587Z 
test_ops.py::TestCompositeComplianceCUDA::test_cow_input_ne_cuda_float32 PASSED [0.0055s] [ 28%] 2025-12-04T13:28:26.5137724Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_empty_cuda_float32 PASSED [0.7736s] [ 28%] 2025-12-04T13:28:26.5137841Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_new_full_cuda_float32 PASSED [0.0049s] [ 28%] 2025-12-04T13:28:26.5138008Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_avg_pool1d_cuda_float32 PASSED [0.7900s] [ 28%] 2025-12-04T13:28:26.5138163Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0119s] [ 29%] 2025-12-04T13:28:26.5138307Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_batch_norm_cuda_float32 PASSED [0.1021s] [ 29%] 2025-12-04T13:28:26.5138630Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv2d_cuda_float32 MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 2400, provided ptr: 0x7b2ecb001c00 size: 1024 2025-12-04T13:28:26.5138817Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7b2ecb001c00 size: 1024 2025-12-04T13:28:26.5139007Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 2400, provided ptr: 0x7b2ecb000e00 size: 1024 2025-12-04T13:28:26.5139188Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7b2ecb000e00 size: 1024 2025-12-04T13:28:26.5139386Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 2400, provided ptr: 0x7b2ecb001000 size: 1024 2025-12-04T13:28:26.5139575Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 2400, provided ptr: 0x7b2ecb001000 size: 1024 2025-12-04T13:28:26.5139617Z PASSED [0.1105s] [ 29%] 
2025-12-04T13:28:26.5139772Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_conv_transpose3d_cuda_float32 PASSED [0.7771s] [ 29%] 2025-12-04T13:28:26.5139926Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cosine_similarity_cuda_float32 PASSED [0.0103s] [ 29%] 2025-12-04T13:28:26.5140073Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_cross_entropy_cuda_float32 PASSED [0.0134s] [ 29%] 2025-12-04T13:28:26.5140220Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_embedding_bag_cuda_float32 PASSED [0.0324s] [ 29%] 2025-12-04T13:28:26.5140405Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_feature_alpha_dropout_without_train_cuda_float32 PASSED [0.7686s] [ 29%] 2025-12-04T13:28:26.5140557Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gaussian_nll_loss_cuda_float32 PASSED [0.2668s] [ 29%] 2025-12-04T13:28:26.5140691Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_gelu_cuda_float32 PASSED [0.0052s] [ 29%] 2025-12-04T13:28:26.5140831Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_hardtanh_cuda_float32 PASSED [0.7681s] [ 29%] 2025-12-04T13:28:26.5140985Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_margin_ranking_loss_cuda_float32 PASSED [0.0208s] [ 29%] 2025-12-04T13:28:26.5141127Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool1d_cuda_float32 PASSED [0.3451s] [ 29%] 2025-12-04T13:28:26.5141267Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_pool3d_cuda_float32 PASSED [0.2298s] [ 29%] 2025-12-04T13:28:26.5141410Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_max_unpool3d_cuda_float32 PASSED [0.8322s] [ 29%] 2025-12-04T13:28:26.5141558Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_mse_loss_cuda_float32 PASSED [0.0067s] [ 29%] 
2025-12-04T13:28:26.5141702Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_constant_cuda_float32 PASSED [0.0206s] [ 29%] 2025-12-04T13:28:26.5141894Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pad_reflect_cuda_float32 PASSED [0.7684s] [ 29%] 2025-12-04T13:28:26.5142028Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_pdist_cuda_float32 PASSED [0.0072s] [ 29%] 2025-12-04T13:28:26.5142182Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_selu_cuda_float32 PASSED [0.7663s] [ 29%] 2025-12-04T13:28:26.5142313Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_silu_cuda_float32 PASSED [0.0051s] [ 29%] 2025-12-04T13:28:26.5142466Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_softmin_with_dtype_cuda_float32 PASSED [0.7657s] [ 29%] 2025-12-04T13:28:26.5142606Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_nn_functional_threshold_cuda_float32 PASSED [0.0054s] [ 29%] 2025-12-04T13:28:26.5142748Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_1_cuda_float32 PASSED [0.7684s] [ 29%] 2025-12-04T13:28:26.5142889Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_polygamma_polygamma_n_2_cuda_float32 PASSED [0.0081s] [ 29%] 2025-12-04T13:28:26.5143011Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_positive_cuda_float32 PASSED [0.7598s] [ 30%] 2025-12-04T13:28:26.5143126Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_pow_cuda_float32 PASSED [0.0096s] [ 30%] 2025-12-04T13:28:26.5143239Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_qr_cuda_float32 PASSED [0.0348s] [ 30%] 2025-12-04T13:28:26.5143363Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reciprocal_cuda_float32 PASSED [0.0032s] [ 30%] 2025-12-04T13:28:26.5143486Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_reshape_as_cuda_float32 PASSED [0.7725s] [ 30%] 
2025-12-04T13:28:26.5143608Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_resize_as__cuda_float32 PASSED [0.0038s] [ 30%] 2025-12-04T13:28:26.5143724Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_cuda_float32 PASSED [0.7670s] [ 30%] 2025-12-04T13:28:26.5143856Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_round_decimals_3_cuda_float32 PASSED [0.0044s] [ 30%] 2025-12-04T13:28:26.5143973Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_rsub_cuda_float32 PASSED [0.7919s] [ 30%] 2025-12-04T13:28:26.5144098Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scalar_tensor_cuda_float32 PASSED [0.0032s] [ 30%] 2025-12-04T13:28:26.5144247Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_amax_cuda_float32 PASSED [0.0180s] [ 30%] 2025-12-04T13:28:26.5144382Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_scatter_reduce_prod_cuda_float32 PASSED [0.0184s] [ 30%] 2025-12-04T13:28:26.5144522Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_bartlett_cuda_float32 PASSED [0.7671s] [ 30%] 2025-12-04T13:28:26.5144663Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_blackman_cuda_float32 PASSED [0.0118s] [ 30%] 2025-12-04T13:28:26.5144808Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_exponential_cuda_float32 PASSED [0.0039s] [ 30%] 2025-12-04T13:28:26.5144944Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signal_windows_hann_cuda_float32 PASSED [0.7654s] [ 30%] 2025-12-04T13:28:26.5145064Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_signbit_cuda_float32 PASSED [0.0033s] [ 30%] 2025-12-04T13:28:26.5145197Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_softmax_with_dtype_cuda_float32 PASSED [0.7625s] [ 30%] 2025-12-04T13:28:26.5145330Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_bessel_j0_cuda_float32 PASSED [0.0037s] [ 30%] 2025-12-04T13:28:26.5145468Z 
test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_erfcx_cuda_float32 PASSED [0.7809s] [ 30%] 2025-12-04T13:28:26.5145593Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_i0e_cuda_float32 PASSED [0.0047s] [ 30%] 2025-12-04T13:28:26.5145748Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_i1_cuda_float32 PASSED [0.7666s] [ 30%] 2025-12-04T13:28:26.5145892Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_modified_bessel_k0_cuda_float32 PASSED [0.0035s] [ 30%] 2025-12-04T13:28:26.5146058Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_scaled_modified_bessel_k0_cuda_float32 PASSED [0.7671s] [ 30%] 2025-12-04T13:28:26.5146223Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_special_shifted_chebyshev_polynomial_u_cuda_float32 PASSED [0.0085s] [ 30%] 2025-12-04T13:28:26.5146352Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_list_args_cuda_float32 PASSED [0.7777s] [ 31%] 2025-12-04T13:28:26.5146483Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_split_with_sizes_cuda_float32 PASSED [0.0061s] [ 31%] 2025-12-04T13:28:26.5146613Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_squeeze_multiple_cuda_float32 PASSED [0.7775s] [ 31%] 2025-12-04T13:28:26.5146734Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_mean_cuda_float32 PASSED [0.0114s] [ 31%] 2025-12-04T13:28:26.5146859Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_std_unbiased_cuda_float32 PASSED [0.7591s] [ 31%] 2025-12-04T13:28:26.5146982Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_sum_to_size_cuda_float32 PASSED [0.0097s] [ 31%] 2025-12-04T13:28:26.5147099Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_copy_cuda_float32 PASSED [0.7575s] [ 31%] 2025-12-04T13:28:26.5147212Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_t_cuda_float32 PASSED [0.0049s] [ 31%] 2025-12-04T13:28:26.5147329Z 
test_ops.py::TestCompositeComplianceCUDA::test_cow_input_take_cuda_float32 PASSED [0.7692s] [ 31%] 2025-12-04T13:28:26.5147587Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_torch_ops_aten__efficient_attention_forward_cuda_float32 SKIPPED [0.0011s] (Efficient attention on ROCM doesn't support custom_mask_type==2) [ 31%] 2025-12-04T13:28:26.5147721Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_triangular_solve_cuda_float32 PASSED [0.0167s] [ 31%] 2025-12-04T13:28:26.5147835Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_tril_cuda_float32 PASSED [0.7681s] [ 31%] 2025-12-04T13:28:26.5147961Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_copy_cuda_float32 PASSED [0.0077s] [ 31%] 2025-12-04T13:28:26.5148076Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unbind_cuda_float32 PASSED [0.7750s] [ 31%] 2025-12-04T13:28:26.5148202Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unfold_cuda_float32 PASSED [0.0125s] [ 31%] 2025-12-04T13:28:26.5148328Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsafe_split_cuda_float32 PASSED [0.0032s] [ 31%] 2025-12-04T13:28:26.5148450Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_unsqueeze_cuda_float32 PASSED [0.7349s] [ 31%] 2025-12-04T13:28:26.5148569Z test_ops.py::TestCompositeComplianceCUDA::test_cow_input_var_mean_cuda_float32 PASSED [0.0107s] [ 31%] 2025-12-04T13:28:26.5148683Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_T_cuda_float32 PASSED [0.0533s] [ 31%] 2025-12-04T13:28:26.5148868Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_lengths_cuda_float32 SKIPPED [0.0011s] (Does not support forward_ad) [ 31%] 2025-12-04T13:28:26.5149048Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__segment_reduce_offsets_cuda_float32 SKIPPED [0.0010s] (Does not support forward_ad) [ 31%] 2025-12-04T13:28:26.5149206Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__unsafe_masked_index_put_accumulate_cuda_float32 PASSED [0.4119s] [ 31%] 2025-12-04T13:28:26.5149333Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_add_cuda_float32 PASSED [0.1143s] [ 31%] 2025-12-04T13:28:26.5149452Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addbmm_cuda_float32 PASSED [0.4352s] [ 31%] 2025-12-04T13:28:26.5149569Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_cuda_float32 PASSED [0.3875s] [ 31%] 2025-12-04T13:28:26.5149712Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_addmm_decomposed_cuda_float32 PASSED [0.3517s] [ 32%] 2025-12-04T13:28:26.5149873Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_aminmax_cuda_float32 SKIPPED [0.0010s] (Does not support autograd) [ 32%] 2025-12-04T13:28:26.5150009Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_1d_cuda_float32 PASSED [0.0194s] [ 32%] 2025-12-04T13:28:26.5150133Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_atleast_3d_cuda_float32 PASSED [0.0214s] [ 32%] 2025-12-04T13:28:26.5150256Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bernoulli_cuda_float32 PASSED [0.0137s] [ 32%] 2025-12-04T13:28:26.5150420Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bfloat16_cuda_float32 SKIPPED [0.0009s] (Does not support forward_ad) [ 32%] 2025-12-04T13:28:26.5150576Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_bool_cuda_float32 SKIPPED [0.0010s] (Does not support autograd) [ 32%] 2025-12-04T13:28:26.5150738Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cfloat_cuda_float32 SKIPPED [0.0009s] (Does not support forward_ad) [ 32%] 2025-12-04T13:28:26.5150900Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_cuda_float32 SKIPPED [0.0009s] (Does not support forward_ad) [ 32%] 2025-12-04T13:28:26.5151033Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_inverse_cuda_float32 PASSED 
[0.0927s] [ 32%] 2025-12-04T13:28:26.5151162Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cholesky_solve_cuda_float32 PASSED [0.2133s] [ 32%] 2025-12-04T13:28:26.5151286Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_max_cuda_float32 PASSED [0.1030s] [ 32%] 2025-12-04T13:28:26.5151409Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clamp_min_cuda_float32 PASSED [0.1028s] [ 32%] 2025-12-04T13:28:26.5151526Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_clone_cuda_float32 PASSED [0.0066s] [ 32%] 2025-12-04T13:28:26.5151646Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_complex_cuda_float32 PASSED [0.0959s] [ 32%] 2025-12-04T13:28:26.5151776Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_constant_pad_nd_cuda_float32 PASSED [0.1157s] [ 32%] 2025-12-04T13:28:26.5151947Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_copysign_cuda_float32 PASSED [0.1098s] [ 32%] 2025-12-04T13:28:26.5152066Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cummax_cuda_float32 PASSED [0.0102s] [ 32%] 2025-12-04T13:28:26.5152198Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cumsum_cuda_float32 PASSED [0.0111s] [ 32%] 2025-12-04T13:28:26.5152315Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_double_cuda_float32 PASSED [0.0133s] [ 32%] 2025-12-04T13:28:26.5152431Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_einsum_cuda_float32 PASSED [0.0596s] [ 32%] 2025-12-04T13:28:26.5152593Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_empty_like_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 32%] 2025-12-04T13:28:26.5152712Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_exp2_cuda_float32 PASSED [0.0098s] [ 32%] 2025-12-04T13:28:26.5152833Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_fft_fft2_cuda_float32 PASSED [0.0222s] [ 32%] 2025-12-04T13:28:26.5152959Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_float_power_cuda_float32 PASSED [0.1831s] [ 32%] 2025-12-04T13:28:26.5153127Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_floor_divide_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5153245Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gather_cuda_float32 PASSED [0.0321s] [ 33%] 2025-12-04T13:28:26.5153426Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_half_cuda_float32 SKIPPED [0.0010s] (Does not support forward_ad) [ 33%] 2025-12-04T13:28:26.5153544Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_hypot_cuda_float32 PASSED [0.1150s] [ 33%] 2025-12-04T13:28:26.5153671Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_i0_cuda_float32 PASSED [0.0093s] [ 33%] 2025-12-04T13:28:26.5153794Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_put_cuda_float32 PASSED [0.0945s] [ 33%] 2025-12-04T13:28:26.5153981Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_amax_cuda_float32 SKIPPED [0.0010s] (Does not support forward_ad) [ 33%] 2025-12-04T13:28:26.5154156Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_index_reduce_mean_cuda_float32 SKIPPED [0.0009s] (Does not support forward_ad) [ 33%] 2025-12-04T13:28:26.5154273Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_inner_cuda_float32 PASSED [0.0305s] [ 33%] 2025-12-04T13:28:26.5154430Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_isnan_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5154583Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_le_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5154715Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_cholesky_cuda_float32 PASSED [0.0812s] [ 33%] 2025-12-04T13:28:26.5154846Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_diagonal_cuda_float32 PASSED [0.0358s] [ 33%] 2025-12-04T13:28:26.5154969Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_eig_cuda_float32 PASSED [0.2248s] [ 33%] 2025-12-04T13:28:26.5155204Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_householder_product_cuda_float32 SKIPPED [0.0011s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 33%] 2025-12-04T13:28:26.5155379Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_ldl_factor_ex_cuda_float32 SKIPPED [0.0015s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5155511Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_cuda_float32 PASSED [0.3352s] [ 33%] 2025-12-04T13:28:26.5155648Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_lu_factor_ex_cuda_float32 PASSED [0.3178s] [ 33%] 2025-12-04T13:28:26.5155777Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_svdvals_cuda_float32 PASSED [0.3160s] [ 33%] 2025-12-04T13:28:26.5155938Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linspace_cuda_float32 SKIPPED [0.0013s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5156054Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log10_cuda_float32 PASSED [0.0091s] [ 33%] 2025-12-04T13:28:26.5156190Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_log_softmax_cuda_float32 PASSED [0.0245s] [ 33%] 2025-12-04T13:28:26.5156353Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logical_or_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 33%] 2025-12-04T13:28:26.5156477Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_logsumexp_cuda_float32 PASSED [0.0535s] [ 33%] 2025-12-04T13:28:26.5156592Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_cuda_float32 PASSED [0.2829s] [ 34%] 2025-12-04T13:28:26.5156725Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_logaddexp_cuda_float32 PASSED [0.9731s] [ 34%] 2025-12-04T13:28:26.5156850Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_norm_cuda_float32 PASSED [3.5779s] [ 34%] 2025-12-04T13:28:26.5156982Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_normalize_cuda_float32 PASSED [0.2312s] [ 34%] 2025-12-04T13:28:26.5157108Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_prod_cuda_float32 PASSED [0.8594s] [ 34%] 2025-12-04T13:28:26.5157237Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_scatter_cuda_float32 PASSED [0.0903s] [ 34%] 2025-12-04T13:28:26.5157374Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_masked_select_cuda_float32 PASSED [0.0414s] [ 34%] 2025-12-04T13:28:26.5157512Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_max_reduction_no_dim_cuda_float32 PASSED [0.0078s] [ 34%] 2025-12-04T13:28:26.5157639Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mean_cuda_float32 PASSED [0.0478s] [ 34%] 2025-12-04T13:28:26.5157777Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_median_cuda_float32 PASSED [0.0394s] [ 34%] 2025-12-04T13:28:26.5157913Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_min_reduction_no_dim_cuda_float32 PASSED [0.0078s] [ 34%] 2025-12-04T13:28:26.5158032Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_movedim_cuda_float32 PASSED [0.0066s] [ 34%] 2025-12-04T13:28:26.5158198Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_multinomial_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 34%] 2025-12-04T13:28:26.5158339Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_mvlgamma_mvlgamma_p_1_cuda_float32 PASSED [0.0315s] [ 34%] 2025-12-04T13:28:26.5158463Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nan_to_num_cuda_float32 PASSED [0.0097s] [ 34%] 2025-12-04T13:28:26.5158580Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_narrow_cuda_float32 XFAIL [0.0057s] [ 34%] 2025-12-04T13:28:26.5158752Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_new_empty_strided_cuda_float32 SKIPPED [1.3043s] (Does not support autograd) [ 34%] 2025-12-04T13:28:26.5158914Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nextafter_cuda_float32 SKIPPED [0.0015s] (Does not support autograd) [ 34%] 2025-12-04T13:28:26.5159070Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0651s] [ 34%] 2025-12-04T13:28:26.5159218Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_alpha_dropout_cuda_float32 PASSED [0.0821s] [ 34%] 2025-12-04T13:28:26.5159371Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_conv_transpose2d_cuda_float32 PASSED [0.5866s] [ 34%] 2025-12-04T13:28:26.5159514Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_embedding_cuda_float32 PASSED [1.2942s] [ 34%] 2025-12-04T13:28:26.5159674Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_fractional_max_pool2d_cuda_float32 PASSED [0.1601s] [ 34%] 2025-12-04T13:28:26.5159819Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardsigmoid_cuda_float32 PASSED [0.0198s] [ 34%] 2025-12-04T13:28:26.5159958Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_hardtanh_cuda_float32 PASSED [0.0136s] [ 34%] 2025-12-04T13:28:26.5160133Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_interpolate_nearest-exact_cuda_float32 PASSED [0.0551s] [ 35%] 2025-12-04T13:28:26.5160271Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_l1_loss_cuda_float32 PASSED [0.0899s] [ 35%] 2025-12-04T13:28:26.5160414Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_leaky_relu_cuda_float32 PASSED [0.0232s] [ 35%] 2025-12-04T13:28:26.5160551Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_linear_cuda_float32 PASSED [0.7705s] [ 35%] 2025-12-04T13:28:26.5160696Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_logsigmoid_cuda_float32 PASSED [1.2962s] [ 35%] 2025-12-04T13:28:26.5160858Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_cuda_float32 SKIPPED [0.0003s] (Skipped!) [ 35%] 2025-12-04T13:28:26.5161010Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_unpool2d_grad_cuda_float32 PASSED [0.2053s] [ 35%] 2025-12-04T13:28:26.5161204Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_multi_margin_loss_cuda_float32 SKIPPED [0.0011s] (Does not support forward_ad) [ 35%] 2025-12-04T13:28:26.5161354Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_nll_loss_cuda_float32 PASSED [0.4794s] [ 35%] 2025-12-04T13:28:26.5161501Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_pixel_shuffle_cuda_float32 PASSED [0.0120s] [ 35%] 2025-12-04T13:28:26.5161663Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_poisson_nll_loss_cuda_float32 PASSED [1.9403s] [ 35%] 2025-12-04T13:28:26.5161815Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rms_norm_cuda_float32 PASSED [0.1288s] [ 35%] 2025-12-04T13:28:26.5161991Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_rrelu_cuda_float32 PASSED [0.0210s] [ 35%] 2025-12-04T13:28:26.5162125Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_silu_cuda_float32 PASSED [1.2842s] [ 35%] 2025-12-04T13:28:26.5162267Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_softshrink_cuda_float32 PASSED [0.0159s] [ 35%] 2025-12-04T13:28:26.5162420Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_upsample_nearest_cuda_float32 PASSED [0.0324s] [ 35%] 2025-12-04T13:28:26.5162577Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nonzero_static_cuda_float32 SKIPPED [0.0007s] (Only runs on cpu) [ 35%] 2025-12-04T13:28:26.5162702Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_norm_fro_cuda_float32 PASSED [0.0114s] [ 35%] 2025-12-04T13:28:26.5162822Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_outer_cuda_float32 PASSED [1.3010s] [ 35%] 2025-12-04T13:28:26.5162942Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_permute_cuda_float32 PASSED [0.0136s] [ 35%] 2025-12-04T13:28:26.5163086Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_polygamma_polygamma_n_0_cuda_float32 PASSED [0.0277s] [ 35%] 2025-12-04T13:28:26.5163209Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_positive_cuda_float32 PASSED [1.2727s] [ 35%] 2025-12-04T13:28:26.5163328Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_prod_cuda_float32 PASSED [0.1452s] [ 35%] 2025-12-04T13:28:26.5163448Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_quantile_cuda_float32 PASSED [1.5855s] [ 35%] 2025-12-04T13:28:26.5163564Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_real_cuda_float32 PASSED [0.0074s] [ 35%] 2025-12-04T13:28:26.5163689Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_reciprocal_cuda_float32 PASSED [0.0089s] [ 36%] 2025-12-04T13:28:26.5163814Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_remainder_cuda_float32 PASSED [0.1073s] [ 36%] 2025-12-04T13:28:26.5163930Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_repeat_cuda_float32 PASSED [0.0532s] [ 36%] 2025-12-04T13:28:26.5164063Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_rot90_cuda_float32 PASSED [0.0807s] [ 36%] 2025-12-04T13:28:26.5164197Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_scatter_reduce_sum_cuda_float32 PASSED [0.4954s] [ 36%] 2025-12-04T13:28:26.5164366Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_searchsorted_cuda_float32 SKIPPED 
[0.0013s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5164547Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_bartlett_cuda_float32 SKIPPED [0.0010s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5164722Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_signal_windows_hann_cuda_float32 SKIPPED [0.0012s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5164842Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_slice_cuda_float32 PASSED [0.0117s] [ 36%] 2025-12-04T13:28:26.5164976Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_softmax_with_dtype_cuda_float32 PASSED [0.0288s] [ 36%] 2025-12-04T13:28:26.5165135Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sparse_mm_reduce_cuda_float32 SKIPPED [0.0005s] (Only runs on cpu) [ 36%] 2025-12-04T13:28:26.5165335Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_hermite_polynomial_he_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5165524Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_laguerre_polynomial_l_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5165724Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k0_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5165917Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_special_modified_bessel_k1_cuda_float32 SKIPPED [0.0009s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5166047Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_split_list_args_cuda_float32 PASSED [0.0167s] [ 36%] 2025-12-04T13:28:26.5166165Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_sqrt_cuda_float32 PASSED [1.2794s] [ 36%] 2025-12-04T13:28:26.5166300Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_std_mean_unbiased_cuda_float32 PASSED [0.0142s] [ 36%] 2025-12-04T13:28:26.5166416Z 
test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_stft_cuda_float32 PASSED [0.1040s] [ 36%] 2025-12-04T13:28:26.5166620Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_svd_lowrank_cuda_float32 SKIPPED [0.0013s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 36%] 2025-12-04T13:28:26.5166749Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_take_along_dim_cuda_float32 PASSED [0.0348s] [ 36%] 2025-12-04T13:28:26.5166914Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_to_sparse_cuda_float32 SKIPPED [0.0010s] (Does not support forward_ad) [ 36%] 2025-12-04T13:28:26.5167071Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_torch_ops_aten__safe_softmax_default_cuda_float32 PASSED [0.0362s] [ 36%] 2025-12-04T13:28:26.5167198Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unfold_copy_cuda_float32 PASSED [1.3138s] [ 36%] 2025-12-04T13:28:26.5167373Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_unique_consecutive_cuda_float32 SKIPPED [0.0016s] (Does not support autograd) [ 36%] 2025-12-04T13:28:26.5167491Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_vdot_cuda_float32 PASSED [0.0157s] [ 37%] 2025-12-04T13:28:26.5167614Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_view_copy_cuda_float32 PASSED [0.0197s] [ 37%] 2025-12-04T13:28:26.5167779Z test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_zeros_like_cuda_float32 SKIPPED [0.0010s] (Does not support autograd) [ 37%] 2025-12-04T13:28:26.5167894Z test_ops.py::TestCompositeComplianceCUDA::test_operator_T_cuda_float32 PASSED [1.2727s] [ 37%] 2025-12-04T13:28:26.5168035Z test_ops.py::TestCompositeComplianceCUDA::test_operator__segment_reduce_offsets_cuda_float32 PASSED [0.1273s] [ 37%] 2025-12-04T13:28:26.5168171Z test_ops.py::TestCompositeComplianceCUDA::test_operator_addmm_cuda_float32 PASSED [0.0167s] [ 37%] 2025-12-04T13:28:26.5168292Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_allclose_cuda_float32 PASSED [0.0283s] [ 37%] 2025-12-04T13:28:26.5168407Z test_ops.py::TestCompositeComplianceCUDA::test_operator_amax_cuda_float32 PASSED [0.0174s] [ 37%] 2025-12-04T13:28:26.5168536Z test_ops.py::TestCompositeComplianceCUDA::test_operator_as_strided_copy_cuda_float32 PASSED [0.0062s] [ 37%] 2025-12-04T13:28:26.5168659Z test_ops.py::TestCompositeComplianceCUDA::test_operator_block_diag_cuda_float32 PASSED [0.0096s] [ 37%] 2025-12-04T13:28:26.5168775Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cauchy_cuda_float32 PASSED [0.0119s] [ 37%] 2025-12-04T13:28:26.5168889Z test_ops.py::TestCompositeComplianceCUDA::test_operator_char_cuda_float32 PASSED [0.0045s] [ 37%] 2025-12-04T13:28:26.5169008Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_cuda_float32 PASSED [0.0147s] [ 37%] 2025-12-04T13:28:26.5169139Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_inverse_cuda_float32 PASSED [0.0159s] [ 37%] 2025-12-04T13:28:26.5169277Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cholesky_solve_cuda_float32 PASSED [0.0167s] [ 37%] 2025-12-04T13:28:26.5169396Z test_ops.py::TestCompositeComplianceCUDA::test_operator_corrcoef_cuda_float32 PASSED [0.0124s] [ 37%] 2025-12-04T13:28:26.5169510Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cov_cuda_float32 PASSED [0.2376s] [ 37%] 2025-12-04T13:28:26.5169636Z test_ops.py::TestCompositeComplianceCUDA::test_operator_cummin_cuda_float32 PASSED [0.0049s] [ 37%] 2025-12-04T13:28:26.5169777Z test_ops.py::TestCompositeComplianceCUDA::test_operator_diagonal_scatter_cuda_float32 PASSED [0.0201s] [ 37%] 2025-12-04T13:28:26.5169913Z test_ops.py::TestCompositeComplianceCUDA::test_operator_div_no_rounding_mode_cuda_float32 PASSED [0.0121s] [ 37%] 2025-12-04T13:28:26.5170030Z test_ops.py::TestCompositeComplianceCUDA::test_operator_equal_cuda_float32 PASSED [0.0062s] [ 37%] 2025-12-04T13:28:26.5170146Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_erf_cuda_float32 PASSED [0.0028s] [ 37%] 2025-12-04T13:28:26.5170260Z test_ops.py::TestCompositeComplianceCUDA::test_operator_exp2_cuda_float32 PASSED [0.0040s] [ 37%] 2025-12-04T13:28:26.5170378Z test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_hfft_cuda_float32 PASSED [0.0104s] [ 37%] 2025-12-04T13:28:26.5170497Z test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_ifft_cuda_float32 PASSED [0.0104s] [ 37%] 2025-12-04T13:28:26.5170616Z test_ops.py::TestCompositeComplianceCUDA::test_operator_fft_rfft2_cuda_float32 PASSED [0.0082s] [ 38%] 2025-12-04T13:28:26.5170734Z test_ops.py::TestCompositeComplianceCUDA::test_operator_fliplr_cuda_float32 PASSED [0.0032s] [ 38%] 2025-12-04T13:28:26.5170847Z test_ops.py::TestCompositeComplianceCUDA::test_operator_fmax_cuda_float32 PASSED [0.0115s] [ 38%] 2025-12-04T13:28:26.5170970Z test_ops.py::TestCompositeComplianceCUDA::test_operator_geometric_cuda_float32 PASSED [0.0121s] [ 38%] 2025-12-04T13:28:26.5171083Z test_ops.py::TestCompositeComplianceCUDA::test_operator_gt_cuda_float32 PASSED [0.0084s] [ 38%] 2025-12-04T13:28:26.5171201Z test_ops.py::TestCompositeComplianceCUDA::test_operator_hstack_cuda_float32 PASSED [0.0037s] [ 38%] 2025-12-04T13:28:26.5171312Z test_ops.py::TestCompositeComplianceCUDA::test_operator_i0_cuda_float32 PASSED [1.2858s] [ 38%] 2025-12-04T13:28:26.5171445Z test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amax_cuda_float32 PASSED [0.0219s] [ 38%] 2025-12-04T13:28:26.5171576Z test_ops.py::TestCompositeComplianceCUDA::test_operator_index_reduce_amin_cuda_float32 PASSED [0.0196s] [ 38%] 2025-12-04T13:28:26.5171691Z test_ops.py::TestCompositeComplianceCUDA::test_operator_int_cuda_float32 PASSED [1.2731s] [ 38%] 2025-12-04T13:28:26.5171809Z test_ops.py::TestCompositeComplianceCUDA::test_operator_isneginf_cuda_float32 PASSED [0.0044s] [ 38%] 2025-12-04T13:28:26.5171978Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_item_cuda_float32 XFAIL [0.0044s] [ 38%] 2025-12-04T13:28:26.5172094Z test_ops.py::TestCompositeComplianceCUDA::test_operator_ldexp_cuda_float32 PASSED [1.2758s] [ 38%] 2025-12-04T13:28:26.5172224Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_cholesky_cuda_float32 PASSED [0.0184s] [ 38%] 2025-12-04T13:28:26.5172349Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigh_cuda_float32 PASSED [0.0158s] [ 38%] 2025-12-04T13:28:26.5172478Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_eigvalsh_cuda_float32 PASSED [0.0118s] [ 38%] 2025-12-04T13:28:26.5172709Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_householder_product_cuda_float32 SKIPPED [0.0007s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 38%] 2025-12-04T13:28:26.5172836Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_inv_ex_cuda_float32 PASSED [0.0115s] [ 38%] 2025-12-04T13:28:26.5172974Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_ldl_factor_ex_cuda_float32 PASSED [0.0064s] [ 38%] 2025-12-04T13:28:26.5173108Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_lu_cuda_float32 PASSED [0.0681s] [ 38%] 2025-12-04T13:28:26.5173320Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_pinv_singular_cuda_float32 SKIPPED [0.0006s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 38%] 2025-12-04T13:28:26.5173461Z test_ops.py::TestCompositeComplianceCUDA::test_operator_linalg_vander_cuda_float32 PASSED [0.0149s] [ 38%] 2025-12-04T13:28:26.5173579Z test_ops.py::TestCompositeComplianceCUDA::test_operator_log1p_cuda_float32 PASSED [1.2625s] [ 38%] 2025-12-04T13:28:26.5173712Z test_ops.py::TestCompositeComplianceCUDA::test_operator_logaddexp_cuda_float32 PASSED [0.0142s] [ 38%] 2025-12-04T13:28:26.5173829Z test_ops.py::TestCompositeComplianceCUDA::test_operator_logdet_cuda_float32 PASSED [1.2789s] [ 38%] 
2025-12-04T13:28:26.5173972Z test_ops.py::TestCompositeComplianceCUDA::test_operator_logspace_tensor_overload_cuda_float32 PASSED [1.2292s] [ 39%] 2025-12-04T13:28:26.5174099Z test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_cumprod_cuda_float32 PASSED [0.0346s] [ 39%] 2025-12-04T13:28:26.5174224Z test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_fill_cuda_float32 PASSED [0.0169s] [ 39%] 2025-12-04T13:28:26.5174347Z test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_norm_cuda_float32 PASSED [0.8221s] [ 39%] 2025-12-04T13:28:26.5174473Z test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_prod_cuda_float32 PASSED [0.1724s] [ 39%] 2025-12-04T13:28:26.5174597Z test_ops.py::TestCompositeComplianceCUDA::test_operator_masked_select_cuda_float32 PASSED [0.0135s] [ 39%] 2025-12-04T13:28:26.5174722Z test_ops.py::TestCompositeComplianceCUDA::test_operator_matrix_exp_cuda_float32 PASSED [0.0097s] [ 39%] 2025-12-04T13:28:26.5174863Z test_ops.py::TestCompositeComplianceCUDA::test_operator_max_reduction_with_dim_cuda_float32 PASSED [0.0057s] [ 39%] 2025-12-04T13:28:26.5174985Z test_ops.py::TestCompositeComplianceCUDA::test_operator_min_binary_cuda_float32 PASSED [0.0120s] [ 39%] 2025-12-04T13:28:26.5175105Z test_ops.py::TestCompositeComplianceCUDA::test_operator_movedim_cuda_float32 PASSED [1.2392s] [ 39%] 2025-12-04T13:28:26.5175220Z test_ops.py::TestCompositeComplianceCUDA::test_operator_mv_cuda_float32 PASSED [0.0050s] [ 39%] 2025-12-04T13:28:26.5175359Z test_ops.py::TestCompositeComplianceCUDA::test_operator_mvlgamma_mvlgamma_p_3_cuda_float32 PASSED [0.0112s] [ 39%] 2025-12-04T13:28:26.5175485Z test_ops.py::TestCompositeComplianceCUDA::test_operator_narrow_copy_cuda_float32 PASSED [1.2738s] [ 39%] 2025-12-04T13:28:26.5175627Z test_ops.py::TestCompositeComplianceCUDA::test_operator_native_dropout_backward_cuda_float32 PASSED [0.0297s] [ 39%] 2025-12-04T13:28:26.5175744Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_ne_cuda_float32 PASSED [0.0089s] [ 39%] 2025-12-04T13:28:26.5175872Z test_ops.py::TestCompositeComplianceCUDA::test_operator_neg_cuda_float32 PASSED [1.2859s] [ 39%] 2025-12-04T13:28:26.5176048Z test_ops.py::TestCompositeComplianceCUDA::test_operator_new_empty_cuda_float32 SKIPPED [0.0003s] (Expected: new_empty is not comparable) [ 39%] 2025-12-04T13:28:26.5176170Z test_ops.py::TestCompositeComplianceCUDA::test_operator_new_zeros_cuda_float32 PASSED [1.2542s] [ 39%] 2025-12-04T13:28:26.5176324Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_adaptive_avg_pool3d_cuda_float32 PASSED [0.0137s] [ 39%] 2025-12-04T13:28:26.5176466Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_avg_pool3d_cuda_float32 PASSED [0.0153s] [ 39%] 2025-12-04T13:28:26.5176637Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_binary_cross_entropy_with_logits_cuda_float32 PASSED [0.0631s] [ 39%] 2025-12-04T13:28:26.5176788Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_conv_transpose2d_cuda_float32 PASSED [0.0355s] [ 39%] 2025-12-04T13:28:26.5176945Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_cosine_embedding_loss_cuda_float32 PASSED [0.0540s] [ 39%] 2025-12-04T13:28:26.5177095Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_dropout_cuda_float32 PASSED [0.0283s] [ 39%] 2025-12-04T13:28:26.5177238Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_hardsigmoid_cuda_float32 PASSED [0.0042s] [ 39%] 2025-12-04T13:28:26.5177402Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_local_response_norm_cuda_float32 PASSED [0.0197s] [ 40%] 2025-12-04T13:28:26.5177547Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_max_unpool3d_cuda_float32 PASSED [0.1469s] [ 40%] 2025-12-04T13:28:26.5177697Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_mse_loss_cuda_float32 PASSED [0.0099s] [ 40%] 2025-12-04T13:28:26.5177835Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_nll_loss_cuda_float32 PASSED [0.1134s] [ 40%] 2025-12-04T13:28:26.5177976Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_normalize_cuda_float32 PASSED [0.0117s] [ 40%] 2025-12-04T13:28:26.5178123Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_pixel_shuffle_cuda_float32 PASSED [0.0054s] [ 40%] 2025-12-04T13:28:26.5178258Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_relu6_cuda_float32 PASSED [0.0042s] [ 40%] 2025-12-04T13:28:26.5178401Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_softshrink_cuda_float32 PASSED [0.0055s] [ 40%] 2025-12-04T13:28:26.5178543Z test_ops.py::TestCompositeComplianceCUDA::test_operator_nn_functional_tanhshrink_cuda_float32 PASSED [0.0045s] [ 40%] 2025-12-04T13:28:26.5178661Z test_ops.py::TestCompositeComplianceCUDA::test_operator_ormqr_cuda_float32 PASSED [0.2898s] [ 40%] 2025-12-04T13:28:26.5178786Z test_ops.py::TestCompositeComplianceCUDA::test_operator_pca_lowrank_cuda_float32 PASSED [0.2990s] [ 40%] 2025-12-04T13:28:26.5178931Z test_ops.py::TestCompositeComplianceCUDA::test_operator_polygamma_polygamma_n_1_cuda_float32 PASSED [0.0100s] [ 40%] 2025-12-04T13:28:26.5179046Z test_ops.py::TestCompositeComplianceCUDA::test_operator_pow_cuda_float32 PASSED [0.0123s] [ 40%] 2025-12-04T13:28:26.5179166Z test_ops.py::TestCompositeComplianceCUDA::test_operator_reshape_cuda_float32 PASSED [0.0073s] [ 40%] 2025-12-04T13:28:26.5179281Z test_ops.py::TestCompositeComplianceCUDA::test_operator_roll_cuda_float32 PASSED [1.2715s] [ 40%] 2025-12-04T13:28:26.5179414Z test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_0_cuda_float32 PASSED [0.0063s] [ 40%] 2025-12-04T13:28:26.5179551Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_round_decimals_neg_3_cuda_float32 PASSED [1.2498s] [ 40%] 2025-12-04T13:28:26.5179676Z test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_add_cuda_float32 PASSED [0.0208s] [ 40%] 2025-12-04T13:28:26.5179794Z test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_cuda_float32 PASSED [0.0583s] [ 40%] 2025-12-04T13:28:26.5179938Z test_ops.py::TestCompositeComplianceCUDA::test_operator_scatter_reduce_amin_cuda_float32 PASSED [0.0497s] [ 40%] 2025-12-04T13:28:26.5180056Z test_ops.py::TestCompositeComplianceCUDA::test_operator_select_cuda_float32 PASSED [0.0066s] [ 40%] 2025-12-04T13:28:26.5180206Z test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_general_hamming_cuda_float32 PASSED [0.0129s] [ 40%] 2025-12-04T13:28:26.5180345Z test_ops.py::TestCompositeComplianceCUDA::test_operator_signal_windows_nuttall_cuda_float32 PASSED [0.0124s] [ 40%] 2025-12-04T13:28:26.5180462Z test_ops.py::TestCompositeComplianceCUDA::test_operator_signbit_cuda_float32 PASSED [1.2501s] [ 40%] 2025-12-04T13:28:26.5180589Z test_ops.py::TestCompositeComplianceCUDA::test_operator_slice_scatter_cuda_float32 PASSED [0.0156s] [ 40%] 2025-12-04T13:28:26.5180740Z test_ops.py::TestCompositeComplianceCUDA::test_operator_special_chebyshev_polynomial_v_cuda_float32 PASSED [0.0146s] [ 41%] 2025-12-04T13:28:26.5180890Z test_ops.py::TestCompositeComplianceCUDA::test_operator_special_laguerre_polynomial_l_cuda_float32 PASSED [0.0136s] [ 41%] 2025-12-04T13:28:26.5181028Z test_ops.py::TestCompositeComplianceCUDA::test_operator_special_zeta_cuda_float32 PASSED [0.0121s] [ 41%] 2025-12-04T13:28:26.5181156Z test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_copy_cuda_float32 PASSED [0.0078s] [ 41%] 2025-12-04T13:28:26.5181286Z test_ops.py::TestCompositeComplianceCUDA::test_operator_squeeze_multiple_cuda_float32 PASSED [0.0065s] [ 41%] 2025-12-04T13:28:26.5181411Z 
test_ops.py::TestCompositeComplianceCUDA::test_operator_sub_cuda_float32 PASSED [0.0144s] [ 41%] 2025-12-04T13:28:26.5181551Z test_ops.py::TestCompositeComplianceCUDA::test_operator_sum_to_size_cuda_float32 PASSED [0.0137s] [ 41%] 2025-12-04T13:28:26.5181672Z test_ops.py::TestCompositeComplianceCUDA::test_operator_svd_lowrank_cuda_float32 PASSED [0.4776s] [ 41%] 2025-12-04T13:28:26.5181787Z test_ops.py::TestCompositeComplianceCUDA::test_operator_tan_cuda_float32 PASSED [1.2482s] [ 41%] 2025-12-04T13:28:26.5181942Z test_ops.py::TestCompositeComplianceCUDA::test_operator_tanh_cuda_float32 PASSED [0.0046s] [ 41%] 2025-12-04T13:28:26.5182062Z test_ops.py::TestCompositeComplianceCUDA::test_operator_uniform_cuda_float32 PASSED [0.0072s] [ 41%] 2025-12-04T13:28:26.5182187Z test_ops.py::TestCompositeComplianceCUDA::test_operator_unsafe_split_cuda_float32 PASSED [0.0046s] [ 41%] 2025-12-04T13:28:26.5182308Z test_ops.py::TestCompositeComplianceCUDA::test_operator_unsqueeze_cuda_float32 PASSED [0.0088s] [ 41%] 2025-12-04T13:28:26.5182424Z test_ops.py::TestCompositeComplianceCUDA::test_operator_zero__cuda_float32 PASSED [0.0047s] [ 41%] 2025-12-04T13:28:26.5182547Z test_ops.py::TestCompositeComplianceCUDA::test_operator_zeros_like_cuda_float32 PASSED [0.0075s] [ 41%] 2025-12-04T13:28:26.5182674Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay___rmatmul___cuda_float32 PASSED [1.2453s] [ 41%] 2025-12-04T13:28:26.5182819Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay__segment_reduce_lengths_cuda_float32 PASSED [0.0216s] [ 41%] 2025-12-04T13:28:26.5182939Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_argmax_cuda_float32 PASSED [1.2314s] [ 41%] 2025-12-04T13:28:26.5183059Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_asinh_cuda_float32 PASSED [0.0119s] [ 41%] 2025-12-04T13:28:26.5183191Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_broadcast_shapes_cuda_float32 PASSED [0.0033s] [ 41%] 2025-12-04T13:28:26.5183309Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_byte_cuda_float32 PASSED [1.2395s] [ 41%] 2025-12-04T13:28:26.5183426Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cat_cuda_float32 PASSED [0.0044s] [ 41%] 2025-12-04T13:28:26.5183544Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_chunk_cuda_float32 PASSED [0.0087s] [ 41%] 2025-12-04T13:28:26.5183669Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_clamp_min_cuda_float32 PASSED [0.0035s] [ 41%] 2025-12-04T13:28:26.5183805Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_copysign_cuda_float32 PASSED [1.2583s] [ 41%] 2025-12-04T13:28:26.5183924Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cosh_cuda_float32 PASSED [0.0034s] [ 42%] 2025-12-04T13:28:26.5184041Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cross_cuda_float32 PASSED [1.2571s] [ 42%] 2025-12-04T13:28:26.5184160Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_cummax_cuda_float32 PASSED [0.0037s] [ 42%] 2025-12-04T13:28:26.5184277Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diag_cuda_float32 PASSED [1.2488s] [ 42%] 2025-12-04T13:28:26.5184413Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_diagonal_scatter_cuda_float32 PASSED [0.0054s] [ 42%] 2025-12-04T13:28:26.5184528Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dist_cuda_float32 PASSED [1.2274s] [ 42%] 2025-12-04T13:28:26.5184667Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_div_no_rounding_mode_cuda_float32 PASSED [0.0051s] [ 42%] 2025-12-04T13:28:26.5184783Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_dot_cuda_float32 PASSED [1.2088s] [ 42%] 2025-12-04T13:28:26.5184915Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_cuda_float32 PASSED [0.0036s] [ 42%] 2025-12-04T13:28:26.5185043Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_empty_strided_cuda_float32 PASSED [1.2366s] [ 42%] 2025-12-04T13:28:26.5185160Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfc_cuda_float32 PASSED [0.0035s] [ 42%] 2025-12-04T13:28:26.5185290Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_erfinv_cuda_float32 PASSED [1.2229s] [ 42%] 2025-12-04T13:28:26.5185419Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exp2_cuda_float32 PASSED [0.0036s] [ 42%] 2025-12-04T13:28:26.5185538Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_expm1_cuda_float32 PASSED [1.2343s] [ 42%] 2025-12-04T13:28:26.5185667Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_exponential_cuda_float32 PASSED [0.0096s] [ 42%] 2025-12-04T13:28:26.5185786Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_eye_cuda_float32 PASSED [1.2363s] [ 42%] 2025-12-04T13:28:26.5185910Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_hfftn_cuda_float32 PASSED [0.0048s] [ 42%] 2025-12-04T13:28:26.5186035Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_fft_ihfft_cuda_float32 PASSED [1.2547s] [ 42%] 2025-12-04T13:28:26.5186156Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_flatten_cuda_float32 PASSED [0.0063s] [ 42%] 2025-12-04T13:28:26.5186283Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_float_power_cuda_float32 PASSED [0.0038s] [ 42%] 2025-12-04T13:28:26.5186400Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_cuda_float32 PASSED [1.2362s] [ 42%] 2025-12-04T13:28:26.5186522Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_full_like_cuda_float32 PASSED [0.0042s] [ 42%] 2025-12-04T13:28:26.5186639Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_ge_cuda_float32 PASSED [0.0036s] [ 42%] 2025-12-04T13:28:26.5186790Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_grid_sampler_3d_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 42%] 2025-12-04T13:28:26.5186914Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_heaviside_cuda_float32 PASSED [0.0033s] [ 42%] 2025-12-04T13:28:26.5187032Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_inner_cuda_float32 PASSED [1.2146s] [ 43%] 2025-12-04T13:28:26.5187148Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_int_cuda_float32 PASSED [0.0038s] [ 43%] 2025-12-04T13:28:26.5187270Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_isfinite_cuda_float32 PASSED [1.2281s] [ 43%] 2025-12-04T13:28:26.5187424Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_jiterator_binary_return_by_ref_cuda_float32 PASSED [0.0069s] [ 43%] 2025-12-04T13:28:26.5187550Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_le_cuda_float32 PASSED [0.0035s] [ 43%] 2025-12-04T13:28:26.5187668Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lerp_cuda_float32 PASSED [1.2430s] [ 43%] 2025-12-04T13:28:26.5187795Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_eig_cuda_float32 PASSED [0.0377s] [ 43%] 2025-12-04T13:28:26.5187919Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_cuda_float32 PASSED [0.0094s] [ 43%] 2025-12-04T13:28:26.5188057Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_factor_ex_cuda_float32 PASSED [0.0077s] [ 43%] 2025-12-04T13:28:26.5188191Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_lu_solve_cuda_float32 PASSED [0.0226s] [ 43%] 2025-12-04T13:28:26.5188327Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_matrix_rank_cuda_float32 PASSED [1.3083s] [ 43%] 2025-12-04T13:28:26.5188468Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_linalg_pinv_hermitian_cuda_float32 PASSED [0.0078s] [ 43%] 2025-12-04T13:28:26.5188587Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_log10_cuda_float32 PASSED [1.2645s] [ 43%] 2025-12-04T13:28:26.5188723Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logaddexp_cuda_float32 PASSED [0.0045s] [ 43%] 2025-12-04T13:28:26.5188850Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logical_and_cuda_float32 PASSED [0.0035s] [ 43%] 2025-12-04T13:28:26.5188995Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_logspace_tensor_overload_cuda_float32 PASSED [0.1750s] [ 43%] 2025-12-04T13:28:26.5189126Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_lu_solve_cuda_float32 PASSED [0.0127s] [ 43%] 2025-12-04T13:28:26.5189264Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amax_cuda_float32 PASSED [0.0165s] [ 43%] 2025-12-04T13:28:26.5189389Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_amin_cuda_float32 PASSED [0.0162s] [ 43%] 2025-12-04T13:28:26.5189518Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_median_cuda_float32 PASSED [0.0054s] [ 43%] 2025-12-04T13:28:26.5189643Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_masked_std_cuda_float32 PASSED [0.0261s] [ 43%] 2025-12-04T13:28:26.5189789Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_meshgrid_list_of_tensors_cuda_float32 PASSED [1.2555s] [ 43%] 2025-12-04T13:28:26.5189911Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_minimum_cuda_float32 PASSED [0.0046s] [ 43%] 2025-12-04T13:28:26.5190031Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mode_cuda_float32 PASSED [1.4300s] [ 43%] 2025-12-04T13:28:26.5190155Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_movedim_cuda_float32 PASSED [1.2496s] [ 43%] 2025-12-04T13:28:26.5190274Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_msort_cuda_float32 PASSED [0.0039s] [ 44%] 2025-12-04T13:28:26.5190391Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_mul_cuda_float32 PASSED [0.0037s] [ 44%] 2025-12-04T13:28:26.5190511Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nanmean_cuda_float32 PASSED 
[1.2680s] [ 44%] 2025-12-04T13:28:26.5190631Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nansum_cuda_float32 PASSED [0.0072s] [ 44%] 2025-12-04T13:28:26.5190767Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_native_batch_norm_cuda_float32 PASSED [1.2623s] [ 44%] 2025-12-04T13:28:26.5190882Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_neg_cuda_float32 PASSED [0.0033s] [ 44%] 2025-12-04T13:28:26.5191017Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_new_empty_strided_cuda_float32 PASSED [1.2847s] [ 44%] 2025-12-04T13:28:26.5191170Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_channel_shuffle_cuda_float32 PASSED [0.0036s] [ 44%] 2025-12-04T13:28:26.5191312Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_dropout_cuda_float32 PASSED [1.2573s] [ 44%] 2025-12-04T13:28:26.5191473Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_embedding_bag_cuda_float32 PASSED [0.0122s] [ 44%] 2025-12-04T13:28:26.5191647Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [1.2632s] [ 44%] 2025-12-04T13:28:26.5191821Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_feature_alpha_dropout_without_train_cuda_float32 PASSED [0.0057s] [ 44%] 2025-12-04T13:28:26.5192011Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardsigmoid_cuda_float32 PASSED [1.2744s] [ 44%] 2025-12-04T13:28:26.5192155Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardswish_cuda_float32 PASSED [0.0147s] [ 44%] 2025-12-04T13:28:26.5192300Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_hardtanh_cuda_float32 PASSED [1.2421s] [ 44%] 2025-12-04T13:28:26.5192460Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_local_response_norm_cuda_float32 PASSED [0.0051s] [ 44%] 
2025-12-04T13:28:26.5192605Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_logsigmoid_cuda_float32 PASSED [1.2555s] [ 44%] 2025-12-04T13:28:26.5192773Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_multi_margin_loss_cuda_float32 PASSED [0.0102s] [ 44%] 2025-12-04T13:28:26.5192913Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_nll_loss_cuda_float32 PASSED [1.2310s] [ 44%] 2025-12-04T13:28:26.5193075Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pad_constant_cuda_float32 PASSED [0.0094s] [ 44%] 2025-12-04T13:28:26.5193236Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_pixel_shuffle_cuda_float32 PASSED [0.0028s] [ 44%] 2025-12-04T13:28:26.5193375Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_prelu_cuda_float32 PASSED [1.2384s] [ 44%] 2025-12-04T13:28:26.5193511Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_selu_cuda_float32 PASSED [0.0035s] [ 44%] 2025-12-04T13:28:26.5193646Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_silu_cuda_float32 PASSED [1.2286s] [ 44%] 2025-12-04T13:28:26.5193791Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_softshrink_cuda_float32 PASSED [0.0038s] [ 44%] 2025-12-04T13:28:26.5193935Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_nn_functional_tanhshrink_cuda_float32 PASSED [1.2414s] [ 45%] 2025-12-04T13:28:26.5194058Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_norm_inf_cuda_float32 PASSED [0.0038s] [ 45%] 2025-12-04T13:28:26.5194187Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_pca_lowrank_cuda_float32 PASSED [1.2686s] [ 45%] 2025-12-04T13:28:26.5194332Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_polygamma_polygamma_n_3_cuda_float32 PASSED [0.0044s] [ 45%] 2025-12-04T13:28:26.5194450Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_qr_cuda_float32 PASSED [0.0142s] [ 45%] 2025-12-04T13:28:26.5194577Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_resize_as__cuda_float32 PASSED [1.2398s] [ 45%] 2025-12-04T13:28:26.5194718Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_round_decimals_neg_3_cuda_float32 PASSED [0.0035s] [ 45%] 2025-12-04T13:28:26.5194857Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_scatter_reduce_prod_cuda_float32 PASSED [1.2431s] [ 45%] 2025-12-04T13:28:26.5194974Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sgn_cuda_float32 PASSED [0.0033s] [ 45%] 2025-12-04T13:28:26.5195117Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_blackman_cuda_float32 PASSED [1.2429s] [ 45%] 2025-12-04T13:28:26.5195269Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_general_cosine_cuda_float32 PASSED [0.0050s] [ 45%] 2025-12-04T13:28:26.5195407Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_signal_windows_hann_cuda_float32 PASSED [1.2264s] [ 45%] 2025-12-04T13:28:26.5195537Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sinh_cuda_float32 PASSED [0.0032s] [ 45%] 2025-12-04T13:28:26.5195660Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_softmax_cuda_float32 PASSED [1.2304s] [ 45%] 2025-12-04T13:28:26.5195793Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_bessel_y0_cuda_float32 PASSED [0.0049s] [ 45%] 2025-12-04T13:28:26.5195949Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_chebyshev_polynomial_u_cuda_float32 PASSED [0.0037s] [ 45%] 2025-12-04T13:28:26.5196076Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_entr_cuda_float32 PASSED [1.2363s] [ 45%] 2025-12-04T13:28:26.5196207Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_erfcx_cuda_float32 PASSED [0.0035s] [ 45%] 2025-12-04T13:28:26.5196354Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_modified_bessel_i1_cuda_float32 PASSED [1.2207s] [ 45%] 2025-12-04T13:28:26.5196483Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_ndtr_cuda_float32 PASSED [0.0037s] [ 45%] 2025-12-04T13:28:26.5196654Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_scaled_modified_bessel_k1_cuda_float32 PASSED [1.2483s] [ 45%] 2025-12-04T13:28:26.5196818Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_v_cuda_float32 PASSED [0.0065s] [ 45%] 2025-12-04T13:28:26.5196999Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_shifted_chebyshev_polynomial_w_cuda_float32 PASSED [0.0046s] [ 45%] 2025-12-04T13:28:26.5197130Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_special_xlog1py_cuda_float32 PASSED [0.0034s] [ 45%] 2025-12-04T13:28:26.5197258Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_std_cuda_float32 PASSED [1.2535s] [ 45%] 2025-12-04T13:28:26.5197372Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sub_cuda_float32 PASSED [0.0048s] [ 46%] 2025-12-04T13:28:26.5197498Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_sum_to_size_cuda_float32 PASSED [1.2340s] [ 46%] 2025-12-04T13:28:26.5197613Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_svd_cuda_float32 PASSED [0.0674s] [ 46%] 2025-12-04T13:28:26.5197729Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tan_cuda_float32 PASSED [1.2534s] [ 46%] 2025-12-04T13:28:26.5197855Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tensordot_cuda_float32 PASSED [0.0038s] [ 46%] 2025-12-04T13:28:26.5197975Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_tile_cuda_float32 PASSED [0.0085s] [ 46%] 2025-12-04T13:28:26.5198093Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_to_cuda_float32 PASSED [1.2296s] [ 46%] 2025-12-04T13:28:26.5198351Z 
test_ops.py::TestCompositeComplianceCUDA::test_view_replay_torch_ops_aten__efficient_attention_forward_cuda_float32 SKIPPED [0.0012s] (Efficient attention on ROCM doesn't support custom_mask_type==2) [ 46%] 2025-12-04T13:28:26.5198471Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_trace_cuda_float32 PASSED [1.2233s] [ 46%] 2025-12-04T13:28:26.5198597Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_true_divide_cuda_float32 PASSED [0.0052s] [ 46%] 2025-12-04T13:28:26.5198717Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unique_cuda_float32 PASSED [0.1601s] [ 46%] 2025-12-04T13:28:26.5198845Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsafe_split_cuda_float32 PASSED [1.2466s] [ 46%] 2025-12-04T13:28:26.5198977Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_unsqueeze_copy_cuda_float32 PASSED [0.0042s] [ 46%] 2025-12-04T13:28:26.5199098Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_view_as_cuda_float32 PASSED [1.2533s] [ 46%] 2025-12-04T13:28:26.5199215Z test_ops.py::TestCompositeComplianceCUDA::test_view_replay_xlogy_cuda_float32 PASSED [0.0046s] [ 46%] 2025-12-04T13:28:26.5199312Z test_ops.py::TestMathBitsCUDA::test_conj_view_H_cuda_complex64 PASSED [1.2635s] [ 46%] 2025-12-04T13:28:26.5199427Z test_ops.py::TestMathBitsCUDA::test_conj_view___radd___cuda_complex64 PASSED [0.0137s] [ 46%] 2025-12-04T13:28:26.5199531Z test_ops.py::TestMathBitsCUDA::test_conj_view___rpow___cuda_complex64 PASSED [0.0147s] [ 46%] 2025-12-04T13:28:26.5199652Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs__conversions_bool_cuda_complex64 PASSED [1.2513s] [ 46%] 2025-12-04T13:28:26.5199757Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_add_cuda_complex64 PASSED [0.0096s] [ 46%] 2025-12-04T13:28:26.5199867Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_addcdiv_cuda_complex64 PASSED [1.2530s] [ 46%] 2025-12-04T13:28:26.5200041Z 
test_ops.py::TestMathBitsCUDA::test_conj_view__refs_as_strided_copy_cuda_complex64 SKIPPED [0.0003s] (Errors when storage_offset is included) [ 46%] 2025-12-04T13:28:26.5200147Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asin_cuda_complex64 PASSED [1.2505s] [ 46%] 2025-12-04T13:28:26.5200254Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_asinh_cuda_complex64 PASSED [0.0043s] [ 46%] 2025-12-04T13:28:26.5200365Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_atleast_2d_cuda_complex64 PASSED [0.0058s] [ 46%] 2025-12-04T13:28:26.5200483Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_cumprod_cuda_complex64 PASSED [1.2449s] [ 47%] 2025-12-04T13:28:26.5200588Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diag_cuda_complex64 PASSED [0.0084s] [ 47%] 2025-12-04T13:28:26.5200714Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_diagonal_copy_cuda_complex64 PASSED [0.0070s] [ 47%] 2025-12-04T13:28:26.5200889Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_empty_strided_cuda_complex64 SKIPPED [0.0002s] (Expected: empty_strided is not comparable) [ 47%] 2025-12-04T13:28:26.5201010Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft2_cuda_complex64 PASSED [1.2250s] [ 47%] 2025-12-04T13:28:26.5201117Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifft_cuda_complex64 PASSED [0.0072s] [ 47%] 2025-12-04T13:28:26.5201228Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_ifftn_cuda_complex64 PASSED [1.2565s] [ 47%] 2025-12-04T13:28:26.5201339Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_fft_irfftn_cuda_complex64 PASSED [0.0078s] [ 47%] 2025-12-04T13:28:26.5201450Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_float_power_cuda_complex64 PASSED [0.0074s] [ 47%] 2025-12-04T13:28:26.5201560Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isfinite_cuda_complex64 PASSED [1.2221s] [ 47%] 2025-12-04T13:28:26.5201666Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_isinf_cuda_complex64 PASSED [0.0039s] [ 
47%] 2025-12-04T13:28:26.5201777Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linalg_svd_cuda_complex64 PASSED [1.4269s] [ 47%] 2025-12-04T13:28:26.5201927Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_linspace_cuda_complex64 XFAIL [0.0031s] [ 47%] 2025-12-04T13:28:26.5202054Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_log_softmax_with_dtype_cuda_complex64 PASSED [1.2520s] [ 47%] 2025-12-04T13:28:26.5202164Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logaddexp_cuda_complex64 PASSED [0.0104s] [ 47%] 2025-12-04T13:28:26.5202276Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_logical_xor_cuda_complex64 PASSED [0.0052s] [ 47%] 2025-12-04T13:28:26.5202385Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_masked_fill_cuda_complex64 PASSED [1.2439s] [ 47%] 2025-12-04T13:28:26.5202490Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_mean_cuda_complex64 PASSED [0.0153s] [ 47%] 2025-12-04T13:28:26.5202597Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_narrow_cuda_complex64 PASSED [1.2251s] [ 47%] 2025-12-04T13:28:26.5202723Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_l1_loss_cuda_complex64 PASSED [0.0073s] [ 47%] 2025-12-04T13:28:26.5202864Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_nn_functional_softmin_with_dtype_cuda_complex64 PASSED [1.2385s] [ 47%] 2025-12-04T13:28:26.5202985Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_ones_cuda_complex64 XFAIL [0.0034s] [ 47%] 2025-12-04T13:28:26.5203090Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_real_cuda_complex64 PASSED [1.2246s] [ 47%] 2025-12-04T13:28:26.5203198Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_renorm_cuda_complex64 PASSED [1.2323s] [ 47%] 2025-12-04T13:28:26.5203305Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_repeat_cuda_complex64 PASSED [0.0149s] [ 47%] 2025-12-04T13:28:26.5203416Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_as_cuda_complex64 PASSED [1.2600s] [ 48%] 
2025-12-04T13:28:26.5203526Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_reshape_cuda_complex64 PASSED [0.0076s] [ 48%] 2025-12-04T13:28:26.5203648Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_softmax_with_dtype_cuda_complex64 PASSED [1.2677s] [ 48%] 2025-12-04T13:28:26.5203753Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_trace_cuda_complex64 PASSED [0.0039s] [ 48%] 2025-12-04T13:28:26.5203859Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_unbind_cuda_complex64 PASSED [1.2746s] [ 48%] 2025-12-04T13:28:26.5203966Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_view_as_cuda_complex64 PASSED [0.0050s] [ 48%] 2025-12-04T13:28:26.5204084Z test_ops.py::TestMathBitsCUDA::test_conj_view__refs_where_cuda_complex64 PASSED [1.2586s] [ 48%] 2025-12-04T13:28:26.5204182Z test_ops.py::TestMathBitsCUDA::test_conj_view_abs_cuda_complex64 PASSED [0.0049s] [ 48%] 2025-12-04T13:28:26.5204282Z test_ops.py::TestMathBitsCUDA::test_conj_view_addbmm_cuda_complex64 PASSED [1.2575s] [ 48%] 2025-12-04T13:28:26.5204397Z test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_cuda_complex64 PASSED [0.0134s] [ 48%] 2025-12-04T13:28:26.5204523Z test_ops.py::TestMathBitsCUDA::test_conj_view_addmm_decomposed_cuda_complex64 PASSED [1.2548s] [ 48%] 2025-12-04T13:28:26.5204619Z test_ops.py::TestMathBitsCUDA::test_conj_view_all_cuda_complex64 PASSED [0.0076s] [ 48%] 2025-12-04T13:28:26.5204783Z test_ops.py::TestMathBitsCUDA::test_conj_view_as_strided_copy_cuda_complex64 SKIPPED [0.0002s] (Errors when storage_offset is included) [ 48%] 2025-12-04T13:28:26.5204881Z test_ops.py::TestMathBitsCUDA::test_conj_view_asinh_cuda_complex64 PASSED [1.2526s] [ 48%] 2025-12-04T13:28:26.5204988Z test_ops.py::TestMathBitsCUDA::test_conj_view_atleast_2d_cuda_complex64 PASSED [0.0105s] [ 48%] 2025-12-04T13:28:26.5205091Z test_ops.py::TestMathBitsCUDA::test_conj_view_baddbmm_cuda_complex64 PASSED [1.2677s] [ 48%] 2025-12-04T13:28:26.5205201Z 
test_ops.py::TestMathBitsCUDA::test_conj_view_cartesian_prod_cuda_complex64 XFAIL [0.0071s] [ 48%] 2025-12-04T13:28:26.5205301Z test_ops.py::TestMathBitsCUDA::test_conj_view_chunk_cuda_complex64 PASSED [2.5090s] [ 48%] 2025-12-04T13:28:26.5205400Z test_ops.py::TestMathBitsCUDA::test_conj_view_conj_cuda_complex64 PASSED [0.0072s] [ 48%] 2025-12-04T13:28:26.5205494Z test_ops.py::TestMathBitsCUDA::test_conj_view_cos_cuda_complex64 PASSED [1.2473s] [ 48%] 2025-12-04T13:28:26.5205592Z test_ops.py::TestMathBitsCUDA::test_conj_view_diag_cuda_complex64 PASSED [0.0203s] [ 48%] 2025-12-04T13:28:26.5205696Z test_ops.py::TestMathBitsCUDA::test_conj_view_diag_embed_cuda_complex64 PASSED [1.2612s] [ 48%] 2025-12-04T13:28:26.5205811Z test_ops.py::TestMathBitsCUDA::test_conj_view_empty_cuda_complex64 SKIPPED [0.0002s] (Skipped!) [ 48%] 2025-12-04T13:28:26.5205930Z test_ops.py::TestMathBitsCUDA::test_conj_view_empty_like_cuda_complex64 SKIPPED [0.0001s] (Skipped!) [ 48%] 2025-12-04T13:28:26.5206035Z test_ops.py::TestMathBitsCUDA::test_conj_view_expand_as_cuda_complex64 PASSED [1.2306s] [ 48%] 2025-12-04T13:28:26.5206137Z test_ops.py::TestMathBitsCUDA::test_conj_view_fft_hfft2_cuda_complex64 PASSED [0.0211s] [ 49%] 2025-12-04T13:28:26.5206242Z test_ops.py::TestMathBitsCUDA::test_conj_view_fft_ifftn_cuda_complex64 PASSED [1.2637s] [ 49%] 2025-12-04T13:28:26.5206345Z test_ops.py::TestMathBitsCUDA::test_conj_view_fft_irfft2_cuda_complex64 PASSED [1.2412s] [ 49%] 2025-12-04T13:28:26.5206446Z test_ops.py::TestMathBitsCUDA::test_conj_view_fliplr_cuda_complex64 PASSED [0.0061s] [ 49%] 2025-12-04T13:28:26.5206554Z test_ops.py::TestMathBitsCUDA::test_conj_view_float_cuda_complex64 PASSED [0.0061s] [ 49%] 2025-12-04T13:28:26.5206658Z test_ops.py::TestMathBitsCUDA::test_conj_view_full_like_cuda_complex64 PASSED [1.2306s] [ 49%] 2025-12-04T13:28:26.5206754Z test_ops.py::TestMathBitsCUDA::test_conj_view_half_cuda_complex64 PASSED [0.0077s] [ 49%] 2025-12-04T13:28:26.5206858Z 
test_ops.py::TestMathBitsCUDA::test_conj_view_index_add_cuda_complex64 PASSED [1.2189s] [ 49%] 2025-12-04T13:28:26.5206963Z test_ops.py::TestMathBitsCUDA::test_conj_view_index_put_cuda_complex64 PASSED [0.0107s] [ 49%] 2025-12-04T13:28:26.5207061Z test_ops.py::TestMathBitsCUDA::test_conj_view_inner_cuda_complex64 PASSED [1.2417s] [ 49%] 2025-12-04T13:28:26.5207157Z test_ops.py::TestMathBitsCUDA::test_conj_view_int_cuda_complex64 PASSED [0.0047s] [ 49%] 2025-12-04T13:28:26.5207255Z test_ops.py::TestMathBitsCUDA::test_conj_view_isinf_cuda_complex64 PASSED [1.2380s] [ 49%] 2025-12-04T13:28:26.5207356Z test_ops.py::TestMathBitsCUDA::test_conj_view_isreal_cuda_complex64 PASSED [0.0041s] [ 49%] 2025-12-04T13:28:26.5207466Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_eigvals_cuda_complex64 PASSED [0.0753s] [ 49%] 2025-12-04T13:28:26.5207584Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_cuda_complex64 PASSED [0.0127s] [ 49%] 2025-12-04T13:28:26.5207693Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_inv_ex_cuda_complex64 PASSED [0.0072s] [ 49%] 2025-12-04T13:28:26.5207901Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_ldl_solve_cuda_complex64 SKIPPED [0.0006s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 49%] 2025-12-04T13:28:26.5208017Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_cuda_complex64 PASSED [1.0011s] [ 49%] 2025-12-04T13:28:26.5208144Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lstsq_grad_oriented_cuda_complex64 PASSED [0.2279s] [ 49%] 2025-12-04T13:28:26.5208256Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_lu_factor_cuda_complex64 PASSED [0.0362s] [ 49%] 2025-12-04T13:28:26.5208371Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_cuda_complex64 PASSED [0.0448s] [ 49%] 2025-12-04T13:28:26.5208499Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_matrix_rank_hermitian_cuda_complex64 PASSED [0.0068s] [ 49%] 2025-12-04T13:28:26.5208605Z 
test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_pinv_cuda_complex64 PASSED [0.7660s] [ 49%] 2025-12-04T13:28:26.5208709Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_qr_cuda_complex64 PASSED [0.0265s] [ 49%] 2025-12-04T13:28:26.5208817Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_solve_cuda_complex64 PASSED [0.0329s] [ 50%] 2025-12-04T13:28:26.5208931Z test_ops.py::TestMathBitsCUDA::test_conj_view_linalg_tensorinv_cuda_complex64 PASSED [0.7388s] [ 50%] 2025-12-04T13:28:26.5209032Z test_ops.py::TestMathBitsCUDA::test_conj_view_linspace_cuda_complex64 XFAIL [0.0029s] [ 50%] 2025-12-04T13:28:26.5209130Z test_ops.py::TestMathBitsCUDA::test_conj_view_log_cuda_complex64 PASSED [0.7449s] [ 50%] 2025-12-04T13:28:26.5209249Z test_ops.py::TestMathBitsCUDA::test_conj_view_log_softmax_with_dtype_cuda_complex64 PASSED [0.0091s] [ 50%] 2025-12-04T13:28:26.5209356Z test_ops.py::TestMathBitsCUDA::test_conj_view_logical_and_cuda_complex64 PASSED [0.0042s] [ 50%] 2025-12-04T13:28:26.5209461Z test_ops.py::TestMathBitsCUDA::test_conj_view_logical_not_cuda_complex64 PASSED [0.7290s] [ 50%] 2025-12-04T13:28:26.5209566Z test_ops.py::TestMathBitsCUDA::test_conj_view_logical_or_cuda_complex64 PASSED [0.0084s] [ 50%] 2025-12-04T13:28:26.5209675Z test_ops.py::TestMathBitsCUDA::test_conj_view_masked_select_cuda_complex64 PASSED [0.7516s] [ 50%] 2025-12-04T13:28:26.5209804Z test_ops.py::TestMathBitsCUDA::test_conj_view_meshgrid_variadic_tensors_cuda_complex64 PASSED [0.0093s] [ 50%] 2025-12-04T13:28:26.5209899Z test_ops.py::TestMathBitsCUDA::test_conj_view_mm_cuda_complex64 PASSED [0.0072s] [ 50%] 2025-12-04T13:28:26.5210029Z test_ops.py::TestMathBitsCUDA::test_conj_view_new_empty_cuda_complex64 SKIPPED [0.0001s] (Skipped!) 
[ 50%] 2025-12-04T13:28:26.5210159Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_channel_shuffle_cuda_complex64 PASSED [0.7344s] [ 50%] 2025-12-04T13:28:26.5210287Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pad_constant_cuda_complex64 PASSED [0.0497s] [ 50%] 2025-12-04T13:28:26.5210421Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_pairwise_distance_cuda_complex64 PASSED [0.7444s] [ 50%] 2025-12-04T13:28:26.5210542Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_rms_norm_cuda_complex64 PASSED [0.0133s] [ 50%] 2025-12-04T13:28:26.5210678Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_triplet_margin_loss_cuda_complex64 PASSED [0.7397s] [ 50%] 2025-12-04T13:28:26.5210797Z test_ops.py::TestMathBitsCUDA::test_conj_view_nn_functional_unfold_cuda_complex64 PASSED [0.1498s] [ 50%] 2025-12-04T13:28:26.5210900Z test_ops.py::TestMathBitsCUDA::test_conj_view_norm_fro_cuda_complex64 PASSED [0.7365s] [ 50%] 2025-12-04T13:28:26.5211004Z test_ops.py::TestMathBitsCUDA::test_conj_view_norm_inf_cuda_complex64 PASSED [0.0081s] [ 50%] 2025-12-04T13:28:26.5211125Z test_ops.py::TestMathBitsCUDA::test_conj_view_ones_like_cuda_complex64 PASSED [0.7416s] [ 50%] 2025-12-04T13:28:26.5211223Z test_ops.py::TestMathBitsCUDA::test_conj_view_ormqr_cuda_complex64 PASSED [0.2652s] [ 50%] 2025-12-04T13:28:26.5211330Z test_ops.py::TestMathBitsCUDA::test_conj_view_permute_copy_cuda_complex64 PASSED [0.7430s] [ 50%] 2025-12-04T13:28:26.5211436Z test_ops.py::TestMathBitsCUDA::test_conj_view_prod_cuda_complex64 PASSED [0.0494s] [ 50%] 2025-12-04T13:28:26.5211540Z test_ops.py::TestMathBitsCUDA::test_conj_view_rand_like_cuda_complex64 PASSED [0.7466s] [ 50%] 2025-12-04T13:28:26.5211657Z test_ops.py::TestMathBitsCUDA::test_conj_view_reciprocal_cuda_complex64 PASSED [0.0078s] [ 51%] 2025-12-04T13:28:26.5211759Z test_ops.py::TestMathBitsCUDA::test_conj_view_renorm_cuda_complex64 PASSED [0.7351s] [ 51%] 2025-12-04T13:28:26.5211883Z 
test_ops.py::TestMathBitsCUDA::test_conj_view_rot90_cuda_complex64 PASSED [0.0376s] [ 51%] 2025-12-04T13:28:26.5211981Z test_ops.py::TestMathBitsCUDA::test_conj_view_rsqrt_cuda_complex64 PASSED [0.7399s] [ 51%] 2025-12-04T13:28:26.5212108Z test_ops.py::TestMathBitsCUDA::test_conj_view_scalar_tensor_cuda_complex64 SKIPPED [0.0002s] (Skipped!) [ 51%] 2025-12-04T13:28:26.5212206Z test_ops.py::TestMathBitsCUDA::test_conj_view_slice_cuda_complex64 PASSED [0.7410s] [ 51%] 2025-12-04T13:28:26.5212324Z test_ops.py::TestMathBitsCUDA::test_conj_view_split_with_sizes_copy_cuda_complex64 PASSED [0.0061s] [ 51%] 2025-12-04T13:28:26.5212439Z test_ops.py::TestMathBitsCUDA::test_conj_view_squeeze_multiple_cuda_complex64 PASSED [0.7450s] [ 51%] 2025-12-04T13:28:26.5212547Z test_ops.py::TestMathBitsCUDA::test_conj_view_std_unbiased_cuda_complex64 PASSED [0.0059s] [ 51%] 2025-12-04T13:28:26.5212644Z test_ops.py::TestMathBitsCUDA::test_conj_view_sub_cuda_complex64 PASSED [0.7585s] [ 51%] 2025-12-04T13:28:26.5212742Z test_ops.py::TestMathBitsCUDA::test_conj_view_sum_cuda_complex64 PASSED [0.0238s] [ 51%] 2025-12-04T13:28:26.5212836Z test_ops.py::TestMathBitsCUDA::test_conj_view_svd_cuda_complex64 PASSED [0.8956s] [ 51%] 2025-12-04T13:28:26.5212932Z test_ops.py::TestMathBitsCUDA::test_conj_view_t_cuda_complex64 PASSED [0.0079s] [ 51%] 2025-12-04T13:28:26.5213029Z test_ops.py::TestMathBitsCUDA::test_conj_view_tile_cuda_complex64 PASSED [0.7668s] [ 51%] 2025-12-04T13:28:26.5213137Z test_ops.py::TestMathBitsCUDA::test_conj_view_true_divide_cuda_complex64 PASSED [0.0167s] [ 51%] 2025-12-04T13:28:26.5213243Z test_ops.py::TestMathBitsCUDA::test_conj_view_unflatten_cuda_complex64 PASSED [0.0113s] [ 51%] 2025-12-04T13:28:26.5213344Z test_ops.py::TestMathBitsCUDA::test_conj_view_unfold_cuda_complex64 PASSED [0.7690s] [ 51%] 2025-12-04T13:28:26.5213443Z test_ops.py::TestMathBitsCUDA::test_conj_view_uniform_cuda_complex64 XFAIL [0.0050s] [ 51%] 2025-12-04T13:28:26.5213545Z 
test_ops.py::TestMathBitsCUDA::test_conj_view_var_mean_cuda_complex64 PASSED [1.4814s] [ 51%] 2025-12-04T13:28:26.5213656Z test_ops.py::TestMathBitsCUDA::test_conj_view_vdot_cuda_complex64 PASSED [0.0069s] [ 51%] 2025-12-04T13:28:26.5213758Z test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_cuda_complex64 PASSED [0.7425s] [ 51%] 2025-12-04T13:28:26.5213924Z test_ops.py::TestMathBitsCUDA::test_conj_view_view_as_real_cuda_complex64 SKIPPED [0.0015s] (Operation doesn't support conjugated inputs.) [ 51%] 2025-12-04T13:28:26.5214022Z test_ops.py::TestMathBitsCUDA::test_conj_view_zero__cuda_complex64 PASSED [0.7454s] [ 51%] 2025-12-04T13:28:26.5214134Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view___getitem___cuda_complex128 PASSED [0.0121s] [ 51%] 2025-12-04T13:28:26.5214244Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rmul___cuda_complex128 PASSED [0.7521s] [ 51%] 2025-12-04T13:28:26.5214351Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view___rsub___cuda_complex128 PASSED [0.0049s] [ 52%] 2025-12-04T13:28:26.5214462Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_acosh_cuda_complex128 PASSED [0.7314s] [ 52%] 2025-12-04T13:28:26.5214576Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_addcdiv_cuda_complex128 PASSED [0.0040s] [ 52%] 2025-12-04T13:28:26.5214760Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_cuda_complex128 SKIPPED [0.0002s] (Errors when storage_offset is included) [ 52%] 2025-12-04T13:28:26.5214951Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_as_strided_partial_views_cuda_complex128 SKIPPED [0.0001s] (Errors when storage_offset is included) [ 52%] 2025-12-04T13:28:26.5215074Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atan_cuda_complex128 PASSED [0.0048s] [ 52%] 2025-12-04T13:28:26.5215206Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_atleast_1d_cuda_complex128 PASSED [0.7340s] [ 52%] 2025-12-04T13:28:26.5215323Z 
test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_block_diag_cuda_complex128 PASSED [0.0038s] [ 52%] 2025-12-04T13:28:26.5215448Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_constant_pad_nd_cuda_complex128 PASSED [0.7484s] [ 52%] 2025-12-04T13:28:26.5215556Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_cosh_cuda_complex128 PASSED [0.0057s] [ 52%] 2025-12-04T13:28:26.5215679Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_count_nonzero_cuda_complex128 PASSED [0.7239s] [ 52%] 2025-12-04T13:28:26.5215794Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_diagonal_cuda_complex128 PASSED [0.0038s] [ 52%] 2025-12-04T13:28:26.5215903Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dot_cuda_complex128 PASSED [0.7379s] [ 52%] 2025-12-04T13:28:26.5216015Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_dsplit_cuda_complex128 PASSED [0.0041s] [ 52%] 2025-12-04T13:28:26.5216175Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_empty_cuda_complex128 SKIPPED [0.0002s] (Expected: empty is not comparable) [ 52%] 2025-12-04T13:28:26.5216292Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expand_as_cuda_complex128 PASSED [0.7338s] [ 52%] 2025-12-04T13:28:26.5216401Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_expm1_cuda_complex128 PASSED [0.0041s] [ 52%] 2025-12-04T13:28:26.5216517Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_hfft2_cuda_complex128 PASSED [1.5875s] [ 52%] 2025-12-04T13:28:26.5216629Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fft_ifft_cuda_complex128 PASSED [1.3339s] [ 52%] 2025-12-04T13:28:26.5216740Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_fliplr_cuda_complex128 PASSED [0.7484s] [ 52%] 2025-12-04T13:28:26.5216851Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_hstack_cuda_complex128 PASSED [0.0043s] [ 52%] 2025-12-04T13:28:26.5216968Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_index_fill_cuda_complex128 PASSED 
[0.7443s] [ 52%] 2025-12-04T13:28:26.5217076Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_item_cuda_complex128 PASSED [0.0036s] [ 52%] 2025-12-04T13:28:26.5217216Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_matrix_norm_cuda_complex128 PASSED [0.7467s] [ 52%] 2025-12-04T13:28:26.5217335Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_norm_cuda_complex128 PASSED [0.0039s] [ 52%] 2025-12-04T13:28:26.5217460Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_svdvals_cuda_complex128 PASSED [0.7870s] [ 53%] 2025-12-04T13:28:26.5217588Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linalg_vector_norm_cuda_complex128 PASSED [0.7370s] [ 53%] 2025-12-04T13:28:26.5217729Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_linspace_tensor_overload_cuda_complex128 PASSED [0.0034s] [ 53%] 2025-12-04T13:28:26.5217843Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_logspace_cuda_complex128 XFAIL [0.0023s] [ 53%] 2025-12-04T13:28:26.5217957Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_movedim_cuda_complex128 PASSED [0.7230s] [ 53%] 2025-12-04T13:28:26.5218072Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_new_ones_cuda_complex128 PASSED [0.0035s] [ 53%] 2025-12-04T13:28:26.5218216Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_pixel_shuffle_cuda_complex128 PASSED [0.7426s] [ 53%] 2025-12-04T13:28:26.5218377Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_nn_functional_softmax_with_dtype_cuda_complex128 PASSED [0.0036s] [ 53%] 2025-12-04T13:28:26.5218528Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_randn_cuda_complex128 SKIPPED [0.0002s] (Test expects tensor input) [ 53%] 2025-12-04T13:28:26.5218651Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_renorm_cuda_complex128 PASSED [0.7322s] [ 53%] 2025-12-04T13:28:26.5218759Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_roll_cuda_complex128 PASSED [0.0035s] [ 53%] 
2025-12-04T13:28:26.5218881Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_rsqrt_cuda_complex128 PASSED [0.7409s] [ 53%] 2025-12-04T13:28:26.5218989Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sin_cuda_complex128 PASSED [0.0057s] [ 53%] 2025-12-04T13:28:26.5219131Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_special_softmax_with_dtype_cuda_complex128 PASSED [0.7393s] [ 53%] 2025-12-04T13:28:26.5219241Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_stack_cuda_complex128 PASSED [0.0036s] [ 53%] 2025-12-04T13:28:26.5219349Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_sub_cuda_complex128 PASSED [0.7291s] [ 53%] 2025-12-04T13:28:26.5219457Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_tan_cuda_complex128 PASSED [0.0039s] [ 53%] 2025-12-04T13:28:26.5219565Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_to_cuda_complex128 PASSED [0.7451s] [ 53%] 2025-12-04T13:28:26.5219682Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_transpose_cuda_complex128 PASSED [0.0035s] [ 53%] 2025-12-04T13:28:26.5219790Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_triu_cuda_complex128 PASSED [0.7239s] [ 53%] 2025-12-04T13:28:26.5219907Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unflatten_cuda_complex128 PASSED [0.0036s] [ 53%] 2025-12-04T13:28:26.5220019Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unfold_cuda_complex128 PASSED [0.7268s] [ 53%] 2025-12-04T13:28:26.5220144Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_unsqueeze_copy_cuda_complex128 PASSED [0.0037s] [ 53%] 2025-12-04T13:28:26.5220252Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_view_cuda_complex128 PASSED [0.7239s] [ 53%] 2025-12-04T13:28:26.5220366Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vsplit_cuda_complex128 PASSED [0.0039s] [ 53%] 2025-12-04T13:28:26.5220476Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__refs_vstack_cuda_complex128 PASSED [0.7272s] [ 54%] 
2025-12-04T13:28:26.5220601Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view__unsafe_masked_index_cuda_complex128 PASSED [0.0062s] [ 54%] 2025-12-04T13:28:26.5220705Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_acos_cuda_complex128 PASSED [0.7402s] [ 54%] 2025-12-04T13:28:26.5220827Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_argwhere_cuda_complex128 PASSED [0.0035s] [ 54%] 2025-12-04T13:28:26.5220942Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_as_strided_copy_cuda_complex128 PASSED [0.7339s] [ 54%] 2025-12-04T13:28:26.5221049Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_asinh_cuda_complex128 PASSED [0.0053s] [ 54%] 2025-12-04T13:28:26.5221159Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_atleast_3d_cuda_complex128 PASSED [0.7289s] [ 54%] 2025-12-04T13:28:26.5221269Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_broadcast_to_cuda_complex128 PASSED [0.0048s] [ 54%] 2025-12-04T13:28:26.5221380Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_contiguous_cuda_complex128 PASSED [0.7394s] [ 54%] 2025-12-04T13:28:26.5221493Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_corrcoef_cuda_complex128 PASSED [0.0065s] [ 54%] 2025-12-04T13:28:26.5221613Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_diagonal_scatter_cuda_complex128 PASSED [0.7351s] [ 54%] 2025-12-04T13:28:26.5221735Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_div_no_rounding_mode_cuda_complex128 PASSED [0.0051s] [ 54%] 2025-12-04T13:28:26.5221841Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_dstack_cuda_complex128 PASSED [0.7310s] [ 54%] 2025-12-04T13:28:26.5222016Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_empty_cuda_complex128 SKIPPED [0.0002s] (Skipped!) 
[ 54%] 2025-12-04T13:28:26.5222120Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_exp2_cuda_complex128 PASSED [0.7248s] [ 54%] 2025-12-04T13:28:26.5222243Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_expand_as_cuda_complex128 PASSED [0.7400s] [ 54%] 2025-12-04T13:28:26.5222350Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_fftn_cuda_complex128 PASSED [0.3246s] [ 54%] 2025-12-04T13:28:26.5222470Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_fft_ifftn_cuda_complex128 PASSED [0.7336s] [ 54%] 2025-12-04T13:28:26.5222577Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_flatten_cuda_complex128 PASSED [0.0045s] [ 54%] 2025-12-04T13:28:26.5222684Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_float_cuda_complex128 PASSED [0.7363s] [ 54%] 2025-12-04T13:28:26.5222793Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_full_like_cuda_complex128 PASSED [0.0037s] [ 54%] 2025-12-04T13:28:26.5222898Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_hstack_cuda_complex128 PASSED [0.7418s] [ 54%] 2025-12-04T13:28:26.5223006Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_add_cuda_complex128 PASSED [0.0053s] [ 54%] 2025-12-04T13:28:26.5223119Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_index_select_cuda_complex128 PASSED [0.7387s] [ 54%] 2025-12-04T13:28:26.5223224Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_inner_cuda_complex128 PASSED [0.0054s] [ 54%] 2025-12-04T13:28:26.5223330Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_isreal_cuda_complex128 PASSED [0.7514s] [ 55%] 2025-12-04T13:28:26.5223434Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_istft_cuda_complex128 PASSED [0.3110s] [ 55%] 2025-12-04T13:28:26.5223571Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_jiterator_2inputs_2outputs_cuda_complex128 XFAIL [0.0046s] [ 55%] 2025-12-04T13:28:26.5223675Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ldexp_cuda_complex128 XFAIL [0.0028s] [ 55%] 2025-12-04T13:28:26.5223789Z 
test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_cross_cuda_complex128 PASSED [1.4703s] [ 55%] 2025-12-04T13:28:26.5223907Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_diagonal_cuda_complex128 PASSED [0.0048s] [ 55%] 2025-12-04T13:28:26.5224017Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eig_cuda_complex128 PASSED [0.8678s] [ 55%] 2025-12-04T13:28:26.5224135Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_eigvalsh_cuda_complex128 PASSED [0.8014s] [ 55%] 2025-12-04T13:28:26.5224246Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_cuda_complex128 PASSED [0.8388s] [ 55%] 2025-12-04T13:28:26.5224362Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_lu_solve_cuda_complex128 PASSED [0.8660s] [ 55%] 2025-12-04T13:28:26.5224490Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_cuda_complex128 PASSED [0.7923s] [ 55%] 2025-12-04T13:28:26.5224632Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_norm_subgradients_at_zero_cuda_complex128 PASSED [0.0043s] [ 55%] 2025-12-04T13:28:26.5224742Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_pinv_cuda_complex128 PASSED [0.8440s] [ 55%] 2025-12-04T13:28:26.5224851Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_qr_cuda_complex128 PASSED [0.7950s] [ 55%] 2025-12-04T13:28:26.5224965Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_linalg_vander_cuda_complex128 PASSED [0.0056s] [ 55%] 2025-12-04T13:28:26.5225069Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_log_cuda_complex128 PASSED [0.7857s] [ 55%] 2025-12-04T13:28:26.5225173Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_logdet_cuda_complex128 PASSED [0.0055s] [ 55%] 2025-12-04T13:28:26.5225293Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_normalize_cuda_complex128 PASSED [0.7974s] [ 55%] 2025-12-04T13:28:26.5225403Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_prod_cuda_complex128 PASSED [0.0052s] [ 55%] 2025-12-04T13:28:26.5225531Z 
test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_scatter_cuda_complex128 PASSED [0.7974s] [ 55%] 2025-12-04T13:28:26.5225640Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_masked_std_cuda_complex128 PASSED [0.0083s] [ 55%] 2025-12-04T13:28:26.5225750Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_matrix_exp_cuda_complex128 PASSED [0.8185s] [ 55%] 2025-12-04T13:28:26.5225860Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_mm_cuda_complex128 PASSED [0.0058s] [ 55%] 2025-12-04T13:28:26.5225973Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_ne_cuda_complex128 PASSED [0.8089s] [ 55%] 2025-12-04T13:28:26.5226097Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_new_empty_cuda_complex128 SKIPPED [0.0002s] (Skipped!) [ 55%] 2025-12-04T13:28:26.5226261Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_feature_alpha_dropout_without_train_cuda_complex128 PASSED [0.8181s] [ 56%] 2025-12-04T13:28:26.5226391Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_normalize_cuda_complex128 PASSED [0.0059s] [ 56%] 2025-12-04T13:28:26.5226518Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_rms_norm_cuda_complex128 PASSED [0.8143s] [ 56%] 2025-12-04T13:28:26.5226659Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_softmin_with_dtype_cuda_complex128 PASSED [0.0044s] [ 56%] 2025-12-04T13:28:26.5226820Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nn_functional_triplet_margin_with_distance_loss_cuda_complex128 PASSED [0.8130s] [ 56%] 2025-12-04T13:28:26.5226967Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_nonzero_static_cuda_complex128 SKIPPED [0.0010s] (Only runs on cpu) [ 56%] 2025-12-04T13:28:26.5227073Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_norm_cuda_complex128 PASSED [0.7937s] [ 56%] 2025-12-04T13:28:26.5227179Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_real_cuda_complex128 PASSED [0.0047s] [ 56%] 2025-12-04T13:28:26.5227289Z 
test_ops.py::TestMathBitsCUDA::test_neg_conj_view_reshape_as_cuda_complex128 PASSED [0.8011s] [ 56%] 2025-12-04T13:28:26.5227471Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize__cuda_complex128 SKIPPED [0.0015s] (Operation not tested with tensors with negative bit.) [ 56%] 2025-12-04T13:28:26.5227581Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_resize_as__cuda_complex128 PASSED [0.8066s] [ 56%] 2025-12-04T13:28:26.5227686Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_roll_cuda_complex128 PASSED [0.0047s] [ 56%] 2025-12-04T13:28:26.5227790Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rot90_cuda_complex128 PASSED [0.7989s] [ 56%] 2025-12-04T13:28:26.5227893Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_rsub_cuda_complex128 PASSED [0.0051s] [ 56%] 2025-12-04T13:28:26.5227998Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_select_cuda_complex128 PASSED [0.8036s] [ 56%] 2025-12-04T13:28:26.5228117Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sigmoid_cuda_complex128 PASSED [0.1959s] [ 56%] 2025-12-04T13:28:26.5228222Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_split_cuda_complex128 PASSED [0.7985s] [ 56%] 2025-12-04T13:28:26.5228337Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_squeeze_copy_cuda_complex128 PASSED [0.0049s] [ 56%] 2025-12-04T13:28:26.5228449Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_std_unbiased_cuda_complex128 PASSED [0.7702s] [ 56%] 2025-12-04T13:28:26.5228552Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_sub_cuda_complex128 PASSED [0.0050s] [ 56%] 2025-12-04T13:28:26.5228664Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_tensor_split_cuda_complex128 PASSED [0.7300s] [ 56%] 2025-12-04T13:28:26.5228765Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_to_cuda_complex128 PASSED [0.0043s] [ 56%] 2025-12-04T13:28:26.5228875Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_true_divide_cuda_complex128 PASSED [0.7504s] [ 56%] 2025-12-04T13:28:26.5228986Z 
test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unbind_copy_cuda_complex128 PASSED [0.0042s] [ 56%] 2025-12-04T13:28:26.5229107Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_copy_cuda_complex128 PASSED [0.7282s] [ 56%] 2025-12-04T13:28:26.5229212Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unfold_cuda_complex128 PASSED [0.0049s] [ 57%] 2025-12-04T13:28:26.5229318Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_uniform_cuda_complex128 XFAIL [0.0038s] [ 57%] 2025-12-04T13:28:26.5229440Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_unsafe_split_cuda_complex128 PASSED [0.7252s] [ 57%] 2025-12-04T13:28:26.5229544Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_vdot_cuda_complex128 PASSED [0.0054s] [ 57%] 2025-12-04T13:28:26.5229726Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_view_as_real_cuda_complex128 SKIPPED [0.0011s] (Operation doesn't support conjugated inputs.) [ 57%] 2025-12-04T13:28:26.5229833Z test_ops.py::TestMathBitsCUDA::test_neg_conj_view_zero__cuda_complex128 PASSED [0.7355s] [ 57%] 2025-12-04T13:28:26.5229931Z test_ops.py::TestMathBitsCUDA::test_neg_view___rdiv___cuda_float64 PASSED [0.0138s] [ 57%] 2025-12-04T13:28:26.5230030Z test_ops.py::TestMathBitsCUDA::test_neg_view___rmul___cuda_float64 PASSED [0.0102s] [ 57%] 2025-12-04T13:28:26.5230126Z test_ops.py::TestMathBitsCUDA::test_neg_view___rpow___cuda_float64 PASSED [0.0128s] [ 57%] 2025-12-04T13:28:26.5230227Z test_ops.py::TestMathBitsCUDA::test_neg_view__chunk_cat_cuda_float64 PASSED [0.0053s] [ 57%] 2025-12-04T13:28:26.5230351Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bfloat16_cuda_float64 PASSED [0.7443s] [ 57%] 2025-12-04T13:28:26.5230470Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_bool_cuda_float64 PASSED [0.0044s] [ 57%] 2025-12-04T13:28:26.5230587Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_chalf_cuda_float64 PASSED [0.7319s] [ 57%] 2025-12-04T13:28:26.5230705Z 
test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_double_cuda_float64 PASSED [0.0050s] [ 57%] 2025-12-04T13:28:26.5230822Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_float_cuda_float64 PASSED [0.7522s] [ 57%] 2025-12-04T13:28:26.5230937Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs__conversions_half_cuda_float64 PASSED [0.0049s] [ 57%] 2025-12-04T13:28:26.5231038Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_addr_cuda_float64 PASSED [0.7344s] [ 57%] 2025-12-04T13:28:26.5231146Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_block_diag_cuda_float64 PASSED [0.0053s] [ 57%] 2025-12-04T13:28:26.5231258Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_broadcast_to_cuda_float64 PASSED [0.0054s] [ 57%] 2025-12-04T13:28:26.5231361Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cauchy_cuda_float64 XFAIL [0.0030s] [ 57%] 2025-12-04T13:28:26.5231464Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_cumsum_cuda_float64 PASSED [0.7401s] [ 57%] 2025-12-04T13:28:26.5231580Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_cuda_float64 PASSED [0.0086s] [ 57%] 2025-12-04T13:28:26.5231697Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_diagonal_scatter_cuda_float64 PASSED [0.0084s] [ 57%] 2025-12-04T13:28:26.5231907Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_empty_like_cuda_float64 SKIPPED [0.0001s] (Expected: empty is not comparable) [ 57%] 2025-12-04T13:28:26.5232010Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_expand_cuda_float64 PASSED [0.7324s] [ 57%] 2025-12-04T13:28:26.5232115Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfft_cuda_float64 PASSED [1.0767s] [ 58%] 2025-12-04T13:28:26.5232219Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_hfftn_cuda_float64 PASSED [1.6127s] [ 58%] 2025-12-04T13:28:26.5232324Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifft_cuda_float64 PASSED [1.4628s] [ 58%] 2025-12-04T13:28:26.5232434Z 
test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ifftshift_cuda_float64 PASSED [0.7312s] [ 58%] 2025-12-04T13:28:26.5232541Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft2_cuda_float64 PASSED [0.8848s] [ 58%] 2025-12-04T13:28:26.5232644Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_ihfft_cuda_float64 PASSED [0.7596s] [ 58%] 2025-12-04T13:28:26.5232763Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fft_irfft_cuda_float64 PASSED [0.7560s] [ 58%] 2025-12-04T13:28:26.5232863Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_fill_cuda_float64 PASSED [0.0047s] [ 58%] 2025-12-04T13:28:26.5232972Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_float_power_cuda_float64 PASSED [0.0068s] [ 58%] 2025-12-04T13:28:26.5233088Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_geometric_cuda_float64 XFAIL [0.0029s] [ 58%] 2025-12-04T13:28:26.5233198Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_gt_cuda_float64 PASSED [0.7336s] [ 58%] 2025-12-04T13:28:26.5233305Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_index_copy_cuda_float64 PASSED [0.7296s] [ 58%] 2025-12-04T13:28:26.5233411Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_isposinf_cuda_float64 PASSED [0.0035s] [ 58%] 2025-12-04T13:28:26.5233522Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_svdvals_cuda_float64 PASSED [0.1833s] [ 58%] 2025-12-04T13:28:26.5233641Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linalg_vector_norm_cuda_float64 PASSED [0.8029s] [ 58%] 2025-12-04T13:28:26.5233747Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_linspace_cuda_float64 XFAIL [0.0029s] [ 58%] 2025-12-04T13:28:26.5233870Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_log_softmax_with_dtype_cuda_float64 PASSED [0.7261s] [ 58%] 2025-12-04T13:28:26.5233975Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logaddexp_cuda_float64 PASSED [0.0093s] [ 58%] 2025-12-04T13:28:26.5234079Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_logspace_cuda_float64 
XFAIL [0.0023s] [ 58%] 2025-12-04T13:28:26.5234178Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_lt_cuda_float64 PASSED [0.0058s] [ 58%] 2025-12-04T13:28:26.5234278Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_mean_cuda_float64 PASSED [0.7529s] [ 58%] 2025-12-04T13:28:26.5234382Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_movedim_cuda_float64 PASSED [0.0041s] [ 58%] 2025-12-04T13:28:26.5234499Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_native_layer_norm_cuda_float64 PASSED [0.0139s] [ 58%] 2025-12-04T13:28:26.5234597Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_ne_cuda_float64 PASSED [0.0055s] [ 58%] 2025-12-04T13:28:26.5234699Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_new_full_cuda_float64 PASSED [0.7221s] [ 58%] 2025-12-04T13:28:26.5234875Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_dropout_cuda_float64 SKIPPED [0.0002s] (Expected: dropout is not comparable) [ 59%] 2025-12-04T13:28:26.5235003Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardshrink_cuda_float64 PASSED [0.7385s] [ 59%] 2025-12-04T13:28:26.5235127Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_hardtanh_cuda_float64 PASSED [0.0051s] [ 59%] 2025-12-04T13:28:26.5235257Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_prelu_cuda_float64 PASSED [0.0143s] [ 59%] 2025-12-04T13:28:26.5235375Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_relu_cuda_float64 PASSED [0.7312s] [ 59%] 2025-12-04T13:28:26.5235513Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softmin_with_dtype_cuda_float64 PASSED [0.0058s] [ 59%] 2025-12-04T13:28:26.5235634Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softplus_cuda_float64 PASSED [0.7330s] [ 59%] 2025-12-04T13:28:26.5235760Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_softshrink_cuda_float64 PASSED [0.0052s] [ 59%] 2025-12-04T13:28:26.5235885Z 
test_ops.py::TestMathBitsCUDA::test_neg_view__refs_nn_functional_tanhshrink_cuda_float64 PASSED [0.7226s] [ 59%] 2025-12-04T13:28:26.5235986Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_prod_cuda_float64 PASSED [0.0205s] [ 59%] 2025-12-04T13:28:26.5236090Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_rad2deg_cuda_float64 PASSED [0.7248s] [ 59%] 2025-12-04T13:28:26.5236191Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_real_cuda_float64 PASSED [0.0042s] [ 59%] 2025-12-04T13:28:26.5236301Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sgn_cuda_float64 PASSED [0.7213s] [ 59%] 2025-12-04T13:28:26.5236436Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_log_softmax_with_dtype_cuda_float64 PASSED [0.0053s] [ 59%] 2025-12-04T13:28:26.5236560Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_special_xlog1py_cuda_float64 PASSED [0.0060s] [ 59%] 2025-12-04T13:28:26.5236670Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_squeeze_copy_cuda_float64 PASSED [0.7394s] [ 59%] 2025-12-04T13:28:26.5236791Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_std_mean_cuda_float64 PASSED [0.0106s] [ 59%] 2025-12-04T13:28:26.5236891Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sub_cuda_float64 PASSED [0.0097s] [ 59%] 2025-12-04T13:28:26.5236999Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_sum_to_size_cuda_float64 PASSED [0.7339s] [ 59%] 2025-12-04T13:28:26.5237111Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_transpose_copy_cuda_float64 PASSED [0.0059s] [ 59%] 2025-12-04T13:28:26.5237222Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_copy_cuda_float64 PASSED [0.0066s] [ 59%] 2025-12-04T13:28:26.5237325Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unbind_cuda_float64 PASSED [0.7310s] [ 59%] 2025-12-04T13:28:26.5237435Z test_ops.py::TestMathBitsCUDA::test_neg_view__refs_unfold_copy_cuda_float64 PASSED [0.0101s] [ 59%] 2025-12-04T13:28:26.5237621Z 
test_ops.py::TestMathBitsCUDA::test_neg_view__refs_view_as_complex_cuda_float64 SKIPPED [0.0010s] (Operation not tested with tensors with negative bit.) [ 59%] 2025-12-04T13:28:26.5237758Z test_ops.py::TestMathBitsCUDA::test_neg_view__unsafe_masked_index_put_accumulate_cuda_float64 PASSED [0.7322s] [ 59%] 2025-12-04T13:28:26.5237853Z test_ops.py::TestMathBitsCUDA::test_neg_view_abs_cuda_float64 PASSED [0.0051s] [ 60%] 2025-12-04T13:28:26.5237953Z test_ops.py::TestMathBitsCUDA::test_neg_view_addcdiv_cuda_float64 PASSED [0.7572s] [ 60%] 2025-12-04T13:28:26.5238050Z test_ops.py::TestMathBitsCUDA::test_neg_view_addmv_cuda_float64 PASSED [0.0118s] [ 60%] 2025-12-04T13:28:26.5238146Z test_ops.py::TestMathBitsCUDA::test_neg_view_argmin_cuda_float64 PASSED [0.7429s] [ 60%] 2025-12-04T13:28:26.5238243Z test_ops.py::TestMathBitsCUDA::test_neg_view_argwhere_cuda_float64 PASSED [0.0051s] [ 60%] 2025-12-04T13:28:26.5238405Z test_ops.py::TestMathBitsCUDA::test_neg_view_as_strided_copy_cuda_float64 SKIPPED [0.0002s] (Errors when storage_offset is included) [ 60%] 2025-12-04T13:28:26.5238502Z test_ops.py::TestMathBitsCUDA::test_neg_view_asinh_cuda_float64 PASSED [0.7445s] [ 60%] 2025-12-04T13:28:26.5238603Z test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_2d_cuda_float64 PASSED [0.0098s] [ 60%] 2025-12-04T13:28:26.5238704Z test_ops.py::TestMathBitsCUDA::test_neg_view_atleast_3d_cuda_float64 PASSED [0.7350s] [ 60%] 2025-12-04T13:28:26.5238812Z test_ops.py::TestMathBitsCUDA::test_neg_view_baddbmm_cuda_float64 PASSED [0.0127s] [ 60%] 2025-12-04T13:28:26.5238914Z test_ops.py::TestMathBitsCUDA::test_neg_view_bernoulli_cuda_float64 PASSED [0.7483s] [ 60%] 2025-12-04T13:28:26.5239011Z test_ops.py::TestMathBitsCUDA::test_neg_view_bfloat16_cuda_float64 PASSED [0.0086s] [ 60%] 2025-12-04T13:28:26.5239106Z test_ops.py::TestMathBitsCUDA::test_neg_view_byte_cuda_float64 PASSED [0.7312s] [ 60%] 2025-12-04T13:28:26.5239200Z test_ops.py::TestMathBitsCUDA::test_neg_view_chunk_cuda_float64 
PASSED [0.0057s] [ 60%] 2025-12-04T13:28:26.5239294Z test_ops.py::TestMathBitsCUDA::test_neg_view_clone_cuda_float64 PASSED [0.7344s] [ 60%] 2025-12-04T13:28:26.5239389Z test_ops.py::TestMathBitsCUDA::test_neg_view_conj_cuda_float64 PASSED [0.0063s] [ 60%] 2025-12-04T13:28:26.5239489Z test_ops.py::TestMathBitsCUDA::test_neg_view_corrcoef_cuda_float64 PASSED [0.7512s] [ 60%] 2025-12-04T13:28:26.5239584Z test_ops.py::TestMathBitsCUDA::test_neg_view_cos_cuda_float64 PASSED [0.0074s] [ 60%] 2025-12-04T13:28:26.5239678Z test_ops.py::TestMathBitsCUDA::test_neg_view_cross_cuda_float64 PASSED [0.7560s] [ 60%] 2025-12-04T13:28:26.5239788Z test_ops.py::TestMathBitsCUDA::test_neg_view_cummin_cuda_float64 PASSED [0.0044s] [ 60%] 2025-12-04T13:28:26.5239904Z test_ops.py::TestMathBitsCUDA::test_neg_view_cumulative_trapezoid_cuda_float64 PASSED [0.7539s] [ 60%] 2025-12-04T13:28:26.5240000Z test_ops.py::TestMathBitsCUDA::test_neg_view_deg2rad_cuda_float64 PASSED [0.0049s] [ 60%] 2025-12-04T13:28:26.5240109Z test_ops.py::TestMathBitsCUDA::test_neg_view_diagflat_cuda_float64 PASSED [0.7513s] [ 60%] 2025-12-04T13:28:26.5240215Z test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_copy_cuda_float64 PASSED [0.0177s] [ 60%] 2025-12-04T13:28:26.5240326Z test_ops.py::TestMathBitsCUDA::test_neg_view_diagonal_cuda_float64 PASSED [0.7535s] [ 60%] 2025-12-04T13:28:26.5240419Z test_ops.py::TestMathBitsCUDA::test_neg_view_dot_cuda_float64 PASSED [0.0052s] [ 61%] 2025-12-04T13:28:26.5240543Z test_ops.py::TestMathBitsCUDA::test_neg_view_empty_strided_cuda_float64 SKIPPED [0.0002s] (Skipped!) 
[ 61%] 2025-12-04T13:28:26.5240637Z test_ops.py::TestMathBitsCUDA::test_neg_view_eq_cuda_float64 PASSED [0.7594s] [ 61%] 2025-12-04T13:28:26.5240731Z test_ops.py::TestMathBitsCUDA::test_neg_view_erfc_cuda_float64 PASSED [0.0085s] [ 61%] 2025-12-04T13:28:26.5240834Z test_ops.py::TestMathBitsCUDA::test_neg_view_expand_copy_cuda_float64 PASSED [0.7489s] [ 61%] 2025-12-04T13:28:26.5240934Z test_ops.py::TestMathBitsCUDA::test_neg_view_fft_irfft_cuda_float64 PASSED [0.6588s] [ 61%] 2025-12-04T13:28:26.5241032Z test_ops.py::TestMathBitsCUDA::test_neg_view_fft_rfftn_cuda_float64 PASSED [3.4638s] [ 61%] 2025-12-04T13:28:26.5241138Z test_ops.py::TestMathBitsCUDA::test_neg_view_floor_divide_cuda_float64 PASSED [0.0078s] [ 61%] 2025-12-04T13:28:26.5241233Z test_ops.py::TestMathBitsCUDA::test_neg_view_frexp_cuda_float64 PASSED [0.0033s] [ 61%] 2025-12-04T13:28:26.5241330Z test_ops.py::TestMathBitsCUDA::test_neg_view_geometric_cuda_float64 XFAIL [0.0033s] [ 61%] 2025-12-04T13:28:26.5241453Z test_ops.py::TestMathBitsCUDA::test_neg_view_grid_sampler_3d_cuda_float64 SKIPPED [0.0001s] (Skipped!) 
[ 61%] 2025-12-04T13:28:26.5241556Z test_ops.py::TestMathBitsCUDA::test_neg_view_index_fill_cuda_float64 PASSED [0.7881s] [ 61%] 2025-12-04T13:28:26.5241669Z test_ops.py::TestMathBitsCUDA::test_neg_view_index_reduce_amin_cuda_float64 PASSED [0.0152s] [ 61%] 2025-12-04T13:28:26.5241766Z test_ops.py::TestMathBitsCUDA::test_neg_view_isclose_cuda_float64 PASSED [0.7905s] [ 61%] 2025-12-04T13:28:26.5241899Z test_ops.py::TestMathBitsCUDA::test_neg_view_isinf_cuda_float64 PASSED [0.0035s] [ 61%] 2025-12-04T13:28:26.5241993Z test_ops.py::TestMathBitsCUDA::test_neg_view_isnan_cuda_float64 PASSED [0.7895s] [ 61%] 2025-12-04T13:28:26.5242089Z test_ops.py::TestMathBitsCUDA::test_neg_view_isreal_cuda_float64 PASSED [0.0042s] [ 61%] 2025-12-04T13:28:26.5242210Z test_ops.py::TestMathBitsCUDA::test_neg_view_jiterator_2inputs_2outputs_cuda_float64 XFAIL [0.0045s] [ 61%] 2025-12-04T13:28:26.5242319Z test_ops.py::TestMathBitsCUDA::test_neg_view_le_cuda_float64 PASSED [0.0043s] [ 61%] 2025-12-04T13:28:26.5242424Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_cross_cuda_float64 PASSED [0.7927s] [ 61%] 2025-12-04T13:28:26.5242525Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_eig_cuda_float64 PASSED [0.0775s] [ 61%] 2025-12-04T13:28:26.5242632Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_lu_solve_cuda_float64 PASSED [0.3867s] [ 61%] 2025-12-04T13:28:26.5242759Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_matrix_rank_hermitian_cuda_float64 PASSED [0.0078s] [ 61%] 2025-12-04T13:28:26.5242870Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_multi_dot_cuda_float64 PASSED [0.0108s] [ 61%] 2025-12-04T13:28:26.5242975Z test_ops.py::TestMathBitsCUDA::test_neg_view_linalg_vecdot_cuda_float64 PASSED [1.3636s] [ 61%] 2025-12-04T13:28:26.5243074Z test_ops.py::TestMathBitsCUDA::test_neg_view_linspace_cuda_float64 XFAIL [0.0031s] [ 62%] 2025-12-04T13:28:26.5243174Z test_ops.py::TestMathBitsCUDA::test_neg_view_logaddexp_cuda_float64 PASSED [0.0140s] [ 62%] 
2025-12-04T13:28:26.5243294Z test_ops.py::TestMathBitsCUDA::test_neg_view_logical_and_cuda_float64 PASSED [0.0042s] [ 62%] 2025-12-04T13:28:26.5243395Z test_ops.py::TestMathBitsCUDA::test_neg_view_logical_or_cuda_float64 PASSED [0.0042s] [ 62%] 2025-12-04T13:28:26.5243499Z test_ops.py::TestMathBitsCUDA::test_neg_view_logical_xor_cuda_float64 PASSED [0.0041s] [ 62%] 2025-12-04T13:28:26.5243608Z test_ops.py::TestMathBitsCUDA::test_neg_view_logspace_cuda_float64 XFAIL [0.0021s] [ 62%] 2025-12-04T13:28:26.5243701Z test_ops.py::TestMathBitsCUDA::test_neg_view_lt_cuda_float64 PASSED [0.0041s] [ 62%] 2025-12-04T13:28:26.5243806Z test_ops.py::TestMathBitsCUDA::test_neg_view_mH_cuda_float64 PASSED [1.2942s] [ 62%] 2025-12-04T13:28:26.5243909Z test_ops.py::TestMathBitsCUDA::test_neg_view_masked_fill_cuda_float64 PASSED [0.0181s] [ 62%] 2025-12-04T13:28:26.5244019Z test_ops.py::TestMathBitsCUDA::test_neg_view_masked_log_softmax_cuda_float64 PASSED [0.0255s] [ 62%] 2025-12-04T13:28:26.5244130Z test_ops.py::TestMathBitsCUDA::test_neg_view_masked_logaddexp_cuda_float64 PASSED [0.0168s] [ 62%] 2025-12-04T13:28:26.5244232Z test_ops.py::TestMathBitsCUDA::test_neg_view_masked_var_cuda_float64 PASSED [0.0973s] [ 62%] 2025-12-04T13:28:26.5244328Z test_ops.py::TestMathBitsCUDA::test_neg_view_matmul_cuda_float64 PASSED [1.2733s] [ 62%] 2025-12-04T13:28:26.5244426Z test_ops.py::TestMathBitsCUDA::test_neg_view_max_binary_cuda_float64 PASSED [0.0155s] [ 62%] 2025-12-04T13:28:26.5244541Z test_ops.py::TestMathBitsCUDA::test_neg_view_max_reduction_no_dim_cuda_float64 PASSED [1.2873s] [ 62%] 2025-12-04T13:28:26.5244658Z test_ops.py::TestMathBitsCUDA::test_neg_view_min_reduction_with_dim_cuda_float64 PASSED [0.0054s] [ 62%] 2025-12-04T13:28:26.5244775Z test_ops.py::TestMathBitsCUDA::test_neg_view_mvlgamma_mvlgamma_p_3_cuda_float64 PASSED [1.2714s] [ 62%] 2025-12-04T13:28:26.5244949Z test_ops.py::TestMathBitsCUDA::test_neg_view_new_empty_strided_cuda_float64 SKIPPED [0.0002s] (Expected: 
new_empty_strided is not comparable) [ 62%] 2025-12-04T13:28:26.5245047Z test_ops.py::TestMathBitsCUDA::test_neg_view_new_ones_cuda_float64 PASSED [1.2671s] [ 62%] 2025-12-04T13:28:26.5245149Z test_ops.py::TestMathBitsCUDA::test_neg_view_nextafter_cuda_float64 PASSED [0.0073s] [ 62%] 2025-12-04T13:28:26.5245282Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_adaptive_avg_pool3d_cuda_float64 PASSED [1.2779s] [ 62%] 2025-12-04T13:28:26.5245403Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_batch_norm_cuda_float64 PASSED [0.0165s] [ 62%] 2025-12-04T13:28:26.5245530Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_conv_transpose3d_cuda_float64 PASSED [1.2756s] [ 62%] 2025-12-04T13:28:26.5245650Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_dropout2d_cuda_float64 PASSED [0.0242s] [ 62%] 2025-12-04T13:28:26.5245872Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_fractional_max_pool3d_cuda_float64 SKIPPED [0.0011s] (Operation not tested with tensors with negative bit.) 
[ 62%] 2025-12-04T13:28:26.5245994Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardsigmoid_cuda_float64 PASSED [1.2695s] [ 63%] 2025-12-04T13:28:26.5246110Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_hardtanh_cuda_float64 PASSED [0.0088s] [ 63%] 2025-12-04T13:28:26.5246230Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_huber_loss_cuda_float64 PASSED [1.2687s] [ 63%] 2025-12-04T13:28:26.5246357Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_area_cuda_float64 PASSED [0.0260s] [ 63%] 2025-12-04T13:28:26.5246494Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_interpolate_trilinear_cuda_float64 PASSED [1.3088s] [ 63%] 2025-12-04T13:28:26.5246613Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_layer_norm_cuda_float64 PASSED [0.0100s] [ 63%] 2025-12-04T13:28:26.5246745Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_local_response_norm_cuda_float64 PASSED [0.0171s] [ 63%] 2025-12-04T13:28:26.5246855Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_mish_cuda_float64 PASSED [1.2706s] [ 63%] 2025-12-04T13:28:26.5246999Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_multi_margin_loss_cuda_float64 PASSED [0.0140s] [ 63%] 2025-12-04T13:28:26.5247117Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_normalize_cuda_float64 PASSED [1.2746s] [ 63%] 2025-12-04T13:28:26.5247238Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pad_replicate_cuda_float64 PASSED [0.0229s] [ 63%] 2025-12-04T13:28:26.5247365Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pdist_cuda_float64 PASSED [1.2673s] [ 63%] 2025-12-04T13:28:26.5247496Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_pixel_shuffle_cuda_float64 PASSED [0.0084s] [ 63%] 2025-12-04T13:28:26.5247607Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_prelu_cuda_float64 PASSED [1.3045s] [ 63%] 2025-12-04T13:28:26.5247731Z 
test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_smooth_l1_loss_cuda_float64 PASSED [0.0130s] [ 63%] 2025-12-04T13:28:26.5247850Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softshrink_cuda_float64 PASSED [1.2768s] [ 63%] 2025-12-04T13:28:26.5247967Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_softsign_cuda_float64 PASSED [0.0077s] [ 63%] 2025-12-04T13:28:26.5248083Z test_ops.py::TestMathBitsCUDA::test_neg_view_nn_functional_threshold_cuda_float64 PASSED [1.2828s] [ 63%] 2025-12-04T13:28:26.5248184Z test_ops.py::TestMathBitsCUDA::test_neg_view_nonzero_cuda_float64 PASSED [0.0107s] [ 63%] 2025-12-04T13:28:26.5248279Z test_ops.py::TestMathBitsCUDA::test_neg_view_norm_cuda_float64 PASSED [1.3144s] [ 63%] 2025-12-04T13:28:26.5248377Z test_ops.py::TestMathBitsCUDA::test_neg_view_outer_cuda_float64 PASSED [0.0051s] [ 63%] 2025-12-04T13:28:26.5248481Z test_ops.py::TestMathBitsCUDA::test_neg_view_permute_copy_cuda_float64 PASSED [1.2560s] [ 63%] 2025-12-04T13:28:26.5248603Z test_ops.py::TestMathBitsCUDA::test_neg_view_polygamma_polygamma_n_3_cuda_float64 PASSED [0.0180s] [ 63%] 2025-12-04T13:28:26.5248696Z test_ops.py::TestMathBitsCUDA::test_neg_view_qr_cuda_float64 PASSED [0.0231s] [ 63%] 2025-12-04T13:28:26.5248798Z test_ops.py::TestMathBitsCUDA::test_neg_view_rand_like_cuda_float64 PASSED [1.2827s] [ 63%] 2025-12-04T13:28:26.5248892Z test_ops.py::TestMathBitsCUDA::test_neg_view_randn_cuda_float64 XFAIL [0.0040s] [ 64%] 2025-12-04T13:28:26.5248989Z test_ops.py::TestMathBitsCUDA::test_neg_view_renorm_cuda_float64 PASSED [2.5393s] [ 64%] 2025-12-04T13:28:26.5249125Z test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hamming_cuda_float64 SKIPPED [0.0002s] (Skipped!) [ 64%] 2025-12-04T13:28:26.5249254Z test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_hann_cuda_float64 SKIPPED [0.0001s] (Skipped!) 
[ 64%] 2025-12-04T13:28:26.5249385Z test_ops.py::TestMathBitsCUDA::test_neg_view_signal_windows_kaiser_cuda_float64 SKIPPED [0.0001s] (Skipped!) [ 64%] 2025-12-04T13:28:26.5249502Z test_ops.py::TestMathBitsCUDA::test_neg_view_signbit_cuda_float64 PASSED [1.2882s] [ 64%] 2025-12-04T13:28:26.5249596Z test_ops.py::TestMathBitsCUDA::test_neg_view_sinh_cuda_float64 PASSED [0.0053s] [ 64%] 2025-12-04T13:28:26.5249690Z test_ops.py::TestMathBitsCUDA::test_neg_view_sort_cuda_float64 PASSED [1.3079s] [ 64%] 2025-12-04T13:28:26.5249798Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_airy_ai_cuda_float64 PASSED [0.0058s] [ 64%] 2025-12-04T13:28:26.5249911Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_bessel_y1_cuda_float64 PASSED [1.2572s] [ 64%] 2025-12-04T13:28:26.5250042Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_t_cuda_float64 PASSED [0.0088s] [ 64%] 2025-12-04T13:28:26.5250172Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_u_cuda_float64 PASSED [0.0062s] [ 64%] 2025-12-04T13:28:26.5250300Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_chebyshev_polynomial_w_cuda_float64 PASSED [0.0057s] [ 64%] 2025-12-04T13:28:26.5250405Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_i0e_cuda_float64 PASSED [1.2857s] [ 64%] 2025-12-04T13:28:26.5250547Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_shifted_chebyshev_polynomial_w_cuda_float64 PASSED [0.0089s] [ 64%] 2025-12-04T13:28:26.5250681Z test_ops.py::TestMathBitsCUDA::test_neg_view_special_spherical_bessel_j0_cuda_float64 PASSED [1.2822s] [ 64%] 2025-12-04T13:28:26.5250777Z test_ops.py::TestMathBitsCUDA::test_neg_view_split_cuda_float64 PASSED [0.0046s] [ 64%] 2025-12-04T13:28:26.5250909Z test_ops.py::TestMathBitsCUDA::test_neg_view_split_list_args_cuda_float64 PASSED [1.2971s] [ 64%] 2025-12-04T13:28:26.5251019Z test_ops.py::TestMathBitsCUDA::test_neg_view_split_with_sizes_cuda_float64 PASSED [0.0059s] [ 64%] 2025-12-04T13:28:26.5251126Z 
test_ops.py::TestMathBitsCUDA::test_neg_view_sqrt_cuda_float64 PASSED [1.2603s] [ 64%] 2025-12-04T13:28:26.5251232Z test_ops.py::TestMathBitsCUDA::test_neg_view_squeeze_copy_cuda_float64 PASSED [0.0114s] [ 64%] 2025-12-04T13:28:26.5251329Z test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_cuda_float64 PASSED [1.2783s] [ 64%] 2025-12-04T13:28:26.5251438Z test_ops.py::TestMathBitsCUDA::test_neg_view_std_mean_unbiased_cuda_float64 PASSED [0.0046s] [ 64%] 2025-12-04T13:28:26.5251541Z test_ops.py::TestMathBitsCUDA::test_neg_view_std_unbiased_cuda_float64 PASSED [1.2920s] [ 64%] 2025-12-04T13:28:26.5251637Z test_ops.py::TestMathBitsCUDA::test_neg_view_t_copy_cuda_float64 PASSED [0.0070s] [ 64%] 2025-12-04T13:28:26.5251730Z test_ops.py::TestMathBitsCUDA::test_neg_view_tanh_cuda_float64 PASSED [1.2727s] [ 65%] 2025-12-04T13:28:26.5251823Z test_ops.py::TestMathBitsCUDA::test_neg_view_topk_cuda_float64 PASSED [1.3160s] [ 65%] 2025-12-04T13:28:26.5251994Z test_ops.py::TestMathBitsCUDA::test_neg_view_torch_ops_aten__safe_softmax_default_cuda_float64 PASSED [1.2804s] [ 65%] 2025-12-04T13:28:26.5252097Z test_ops.py::TestMathBitsCUDA::test_neg_view_true_divide_cuda_float64 PASSED [0.0151s] [ 65%] 2025-12-04T13:28:26.5252770Z test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_copy_cuda_float64 PASSED [1.2939s] [ 65%] 2025-12-04T13:28:26.5253589Z test_ops.py::TestMathBitsCUDA::test_neg_view_unbind_cuda_float64 PASSED [0.0082s] [ 65%] 2025-12-04T13:28:26.5253843Z test_ops.py::TestMathBitsCUDA::test_neg_view_unflatten_cuda_float64 PASSED [0.0106s] [ 65%] 2025-12-04T13:28:26.5254074Z test_ops.py::TestMathBitsCUDA::test_neg_view_unsafe_chunk_cuda_float64 PASSED [1.2656s] [ 65%] 2025-12-04T13:28:26.5254286Z test_ops.py::TestMathBitsCUDA::test_neg_view_unsqueeze_cuda_float64 PASSED [0.0135s] [ 65%] 2025-12-04T13:28:26.5254495Z test_ops.py::TestMathBitsCUDA::test_neg_view_vstack_cuda_float64 PASSED [1.2801s] [ 65%] 2025-12-04T13:28:26.5254708Z 
test_ops.py::TestMathBitsCUDA::test_neg_view_where_cuda_float64 PASSED [0.0107s] [ 65%] 2025-12-04T13:28:26.5254903Z test_ops.py::TestMathBitsCUDA::test_neg_view_xlogy_cuda_float64 PASSED [0.0126s] [ 65%] 2025-12-04T13:28:26.5255119Z test_ops.py::TestFakeTensorCUDA::test_fake___getitem___cuda_float32 PASSED [0.0357s] [ 65%] 2025-12-04T13:28:26.5255838Z test_ops.py::TestFakeTensorCUDA::test_fake___rxor___cuda_int64 PASSED [0.0099s] [ 65%] 2025-12-04T13:28:26.5256102Z test_ops.py::TestFakeTensorCUDA::test_fake__segment_reduce_lengths_cuda_float32 PASSED [0.0768s] [ 65%] 2025-12-04T13:28:26.5256345Z test_ops.py::TestFakeTensorCUDA::test_fake__softmax_backward_data_cuda_float32 PASSED [0.0082s] [ 65%] 2025-12-04T13:28:26.5256552Z test_ops.py::TestFakeTensorCUDA::test_fake_acosh_cuda_float32 PASSED [0.0043s] [ 65%] 2025-12-04T13:28:26.5256757Z test_ops.py::TestFakeTensorCUDA::test_fake_add_cuda_float32 PASSED [0.0114s] [ 65%] 2025-12-04T13:28:26.5256965Z test_ops.py::TestFakeTensorCUDA::test_fake_addcmul_cuda_float32 PASSED [1.3024s] [ 65%] 2025-12-04T13:28:26.5257177Z test_ops.py::TestFakeTensorCUDA::test_fake_alias_copy_cuda_float32 PASSED [0.0058s] [ 65%] 2025-12-04T13:28:26.5257378Z test_ops.py::TestFakeTensorCUDA::test_fake_all_cuda_float32 PASSED [0.0203s] [ 65%] 2025-12-04T13:28:26.5257590Z test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_cuda_float32 PASSED [0.0057s] [ 65%] 2025-12-04T13:28:26.5257841Z test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_partial_views_cuda_float32 PASSED [0.0052s] [ 65%] 2025-12-04T13:28:26.5258195Z test_ops.py::TestFakeTensorCUDA::test_fake_as_strided_scatter_cuda_float32 PASSED [0.0076s] [ 65%] 2025-12-04T13:28:26.5258391Z test_ops.py::TestFakeTensorCUDA::test_fake_asin_cuda_float32 PASSED [1.2846s] [ 65%] 2025-12-04T13:28:26.5258603Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_H_cuda_float32 PASSED [0.0062s] [ 66%] 2025-12-04T13:28:26.5258918Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rmatmul___cuda_float32 PASSED [0.4064s] [ 66%] 2025-12-04T13:28:26.5259201Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast___rsub___cuda_float32 PASSED [0.0104s] [ 66%] 2025-12-04T13:28:26.5259474Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast__softmax_backward_data_cuda_float32 PASSED [0.0068s] [ 66%] 2025-12-04T13:28:26.5259791Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast__unsafe_masked_index_put_accumulate_cuda_float32 PASSED [0.0265s] [ 66%] 2025-12-04T13:28:26.5260008Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_abs_cuda_float32 PASSED [1.3001s] [ 66%] 2025-12-04T13:28:26.5260272Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addmm_decomposed_cuda_float32 PASSED [0.0329s] [ 66%] 2025-12-04T13:28:26.5260492Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_addr_cuda_float32 PASSED [0.0148s] [ 66%] 2025-12-04T13:28:26.5260715Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_all_cuda_float32 PASSED [0.0199s] [ 66%] 2025-12-04T13:28:26.5260929Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_any_cuda_float32 PASSED [0.0173s] [ 66%] 2025-12-04T13:28:26.5261185Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_as_strided_copy_cuda_float32 PASSED [0.0057s] [ 66%] 2025-12-04T13:28:26.5261421Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_atleast_3d_cuda_float32 PASSED [1.2727s] [ 66%] 2025-12-04T13:28:26.5261654Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_baddbmm_cuda_float32 PASSED [0.0176s] [ 66%] 2025-12-04T13:28:26.5261968Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_broadcast_to_cuda_float32 PASSED [0.0072s] [ 66%] 2025-12-04T13:28:26.5262186Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cat_cuda_float32 PASSED [0.0099s] [ 66%] 2025-12-04T13:28:26.5262411Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_chalf_cuda_float32 PASSED [0.0076s] [ 66%] 2025-12-04T13:28:26.5262648Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast_clamp_max_cuda_float32 PASSED [0.0120s] [ 66%] 2025-12-04T13:28:26.5262873Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_conj_cuda_float32 PASSED [0.0034s] [ 66%] 2025-12-04T13:28:26.5263175Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_cumulative_trapezoid_cuda_float32 PASSED [0.0265s] [ 66%] 2025-12-04T13:28:26.5263403Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_digamma_cuda_float32 PASSED [0.0043s] [ 66%] 2025-12-04T13:28:26.5263690Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_div_trunc_rounding_cuda_float32 PASSED [0.0109s] [ 66%] 2025-12-04T13:28:26.5263917Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_cuda_float32 PASSED [0.0054s] [ 66%] 2025-12-04T13:28:26.5264152Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_empty_like_cuda_float32 PASSED [0.0076s] [ 66%] 2025-12-04T13:28:26.5264374Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_exp2_cuda_float32 PASSED [0.0042s] [ 66%] 2025-12-04T13:28:26.5264599Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expand_cuda_float32 PASSED [0.0083s] [ 67%] 2025-12-04T13:28:26.5264822Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_expm1_cuda_float32 PASSED [0.0029s] [ 67%] 2025-12-04T13:28:26.5265051Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_fft2_cuda_float32 PASSED [0.0101s] [ 67%] 2025-12-04T13:28:26.5265289Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftn_cuda_float32 PASSED [0.0122s] [ 67%] 2025-12-04T13:28:26.5265528Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_ifftshift_cuda_float32 PASSED [0.0068s] [ 67%] 2025-12-04T13:28:26.5265735Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fft_irfftn_cuda_float32 PASSED [1.2966s] [ 67%] 2025-12-04T13:28:26.5271311Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_float_cuda_float32 PASSED [0.0079s] [ 67%] 2025-12-04T13:28:26.5271476Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_fmin_cuda_float32 PASSED 
[0.0107s] [ 67%] 2025-12-04T13:28:26.5271647Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_full_like_cuda_float32 PASSED [0.0080s] [ 67%] 2025-12-04T13:28:26.5271839Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_geqrf_cuda_float32 PASSED [0.0437s] [ 67%] 2025-12-04T13:28:26.5272102Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_grid_sampler_2d_cuda_float32 PASSED [1.4352s] [ 67%] 2025-12-04T13:28:26.5272274Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_igammac_cuda_float32 PASSED [0.0108s] [ 67%] 2025-12-04T13:28:26.5272441Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_imag_cuda_complex64 PASSED [0.0052s] [ 67%] 2025-12-04T13:28:26.5272615Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isfinite_cuda_float32 PASSED [0.0064s] [ 67%] 2025-12-04T13:28:26.5272778Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isnan_cuda_float32 PASSED [1.3053s] [ 67%] 2025-12-04T13:28:26.5272945Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_isreal_cuda_float32 PASSED [0.0072s] [ 67%] 2025-12-04T13:28:26.5273103Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_item_cuda_float32 XFAIL [0.0038s] [ 67%] 2025-12-04T13:28:26.5273270Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_lerp_cuda_float32 PASSED [1.2797s] [ 67%] 2025-12-04T13:28:26.5273444Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_det_cuda_float32 PASSED [0.0129s] [ 67%] 2025-12-04T13:28:26.5273626Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_inv_ex_cuda_float32 PASSED [0.0143s] [ 67%] 2025-12-04T13:28:26.5273816Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_lu_factor_ex_cuda_float32 PASSED [0.0361s] [ 67%] 2025-12-04T13:28:26.5274010Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_matrix_norm_cuda_float32 PASSED [0.0848s] [ 67%] 2025-12-04T13:28:26.5274193Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_norm_cuda_float32 PASSED [0.1235s] [ 67%] 2025-12-04T13:28:26.5274434Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_hermitian_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 67%] 2025-12-04T13:28:26.5274755Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_pinv_singular_cuda_float32 SKIPPED [0.0007s] (test is slow; run with PYTORCH_TEST_WITH_SLOW to enable test) [ 67%] 2025-12-04T13:28:26.5274962Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_solve_triangular_cuda_float32 PASSED [0.1170s] [ 68%] 2025-12-04T13:28:26.5275147Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_svdvals_cuda_float32 PASSED [0.0321s] [ 68%] 2025-12-04T13:28:26.5275421Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linalg_tensorsolve_cuda_float32 SKIPPED [0.0009s] (Skip failing test) [ 68%] 2025-12-04T13:28:26.5275570Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_linspace_cuda_float32 PASSED [0.0324s] [ 68%] 2025-12-04T13:28:26.5275698Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log10_cuda_float32 PASSED [1.2854s] [ 68%] 2025-12-04T13:28:26.5275829Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log1p_cuda_float32 PASSED [0.0051s] [ 68%] 2025-12-04T13:28:26.5275954Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_cuda_float32 PASSED [0.0055s] [ 68%] 2025-12-04T13:28:26.5276094Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_log_normal_cuda_float32 PASSED [0.0065s] [ 68%] 2025-12-04T13:28:26.5277004Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logical_or_cuda_float32 PASSED [0.0114s] [ 68%] 2025-12-04T13:28:26.5277406Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_cuda_float32 PASSED [0.1970s] [ 68%] 2025-12-04T13:28:26.5277620Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_logspace_tensor_overload_cuda_float32 PASSED [0.8802s] [ 68%] 2025-12-04T13:28:26.5277765Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmax_cuda_float32 PASSED [0.0941s] [ 68%] 2025-12-04T13:28:26.5278364Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_argmin_cuda_float32 PASSED [0.0916s] [ 68%] 2025-12-04T13:28:26.5278496Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_median_cuda_float32 PASSED [0.0237s] [ 68%] 2025-12-04T13:28:26.5278704Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_prod_cuda_float32 PASSED [0.1371s] [ 68%] 2025-12-04T13:28:26.5278838Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_masked_select_cuda_float32 PASSED [0.0428s] [ 68%] 2025-12-04T13:28:26.5279016Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_matrix_exp_cuda_float32 PASSED [0.0058s] [ 68%] 2025-12-04T13:28:26.5279164Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_list_of_tensors_cuda_float32 PASSED [0.0274s] [ 68%] 2025-12-04T13:28:26.5279320Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_meshgrid_variadic_tensors_cuda_float32 PASSED [0.0232s] [ 68%] 2025-12-04T13:28:26.5279471Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_min_reduction_with_dim_cuda_float32 PASSED [0.0053s] [ 68%] 2025-12-04T13:28:26.5279605Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_msort_cuda_float32 PASSED [0.0038s] [ 68%] 2025-12-04T13:28:26.5279722Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_mv_cuda_float32 PASSED [0.0047s] [ 68%] 2025-12-04T13:28:26.5279887Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_narrow_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 68%] 2025-12-04T13:28:26.5280047Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_batch_norm_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 68%] 2025-12-04T13:28:26.5280198Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_native_layer_norm_cuda_float32 PASSED [0.0547s] [ 68%] 2025-12-04T13:28:26.5280364Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_avg_pool2d_cuda_float32 PASSED [0.0088s] [ 69%] 2025-12-04T13:28:26.5280532Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_adaptive_max_pool2d_cuda_float32 PASSED [0.0261s] [ 69%] 2025-12-04T13:28:26.5280686Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool1d_cuda_float32 PASSED [1.3357s] [ 69%] 2025-12-04T13:28:26.5280832Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_avg_pool3d_cuda_float32 PASSED [0.0111s] [ 69%] 2025-12-04T13:28:26.5281008Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_batch_norm_without_cudnn_cuda_float32 PASSED [0.0521s] [ 69%] 2025-12-04T13:28:26.5281167Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_conv_transpose2d_cuda_float32 PASSED [0.6598s] [ 69%] 2025-12-04T13:28:26.5281334Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_embedding_loss_cuda_float32 PASSED [0.0369s] [ 69%] 2025-12-04T13:28:26.5281534Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_cosine_similarity_cuda_float32 PASSED [0.0354s] [ 69%] 2025-12-04T13:28:26.5281688Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_dropout_cuda_float32 PASSED [0.0219s] [ 69%] 2025-12-04T13:28:26.5281918Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [0.0111s] [ 69%] 2025-12-04T13:28:26.5282066Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_glu_cuda_float32 PASSED [0.0381s] [ 69%] 2025-12-04T13:28:26.5282230Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_interpolate_nearest_cuda_float32 PASSED [0.0174s] [ 69%] 2025-12-04T13:28:26.5282380Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_linear_cuda_float32 PASSED [0.0683s] [ 69%] 2025-12-04T13:28:26.5282534Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool2d_cuda_float32 PASSED [1.1231s] [ 69%] 2025-12-04T13:28:26.5282681Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_pool3d_cuda_float32 PASSED [0.4664s] [ 69%] 2025-12-04T13:28:26.5282843Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool1d_grad_cuda_float32 PASSED [0.0980s] [ 69%] 2025-12-04T13:28:26.5283013Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_cuda_float32 PASSED [1.0256s] [ 69%] 2025-12-04T13:28:26.5283173Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool2d_grad_cuda_float32 PASSED [0.1274s] [ 69%] 2025-12-04T13:28:26.5283342Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_max_unpool3d_cuda_float32 PASSED [0.4098s] [ 69%] 2025-12-04T13:28:26.5283515Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_constant_cuda_float32 PASSED [0.0359s] [ 69%] 2025-12-04T13:28:26.5283666Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pad_replicate_cuda_float32 PASSED [0.0109s] [ 69%] 2025-12-04T13:28:26.5283828Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pairwise_distance_cuda_float32 PASSED [1.3232s] [ 69%] 2025-12-04T13:28:26.5283982Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_pixel_unshuffle_cuda_float32 PASSED [0.0084s] [ 69%] 2025-12-04T13:28:26.5284130Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_relu_cuda_float32 PASSED [0.0063s] [ 69%] 2025-12-04T13:28:26.5284277Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_rms_norm_cuda_float32 PASSED [0.0178s] [ 69%] 2025-12-04T13:28:26.5284424Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_selu_cuda_float32 PASSED [0.0053s] [ 70%] 
2025-12-04T13:28:26.5284564Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_nn_functional_threshold_cuda_float32 PASSED [0.0060s] [ 70%] 2025-12-04T13:28:26.5284682Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_ones_like_cuda_float32 PASSED [0.0077s] [ 70%] 2025-12-04T13:28:26.5284807Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_pca_lowrank_cuda_float32 PASSED [0.0120s] [ 70%] 2025-12-04T13:28:26.5284922Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_permute_cuda_float32 PASSED [0.0049s] [ 70%] 2025-12-04T13:28:26.5285041Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_put_cuda_float32 PASSED [0.0247s] [ 70%] 2025-12-04T13:28:26.5285153Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rad2deg_cuda_float32 PASSED [0.0030s] [ 70%] 2025-12-04T13:28:26.5285277Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_reshape_as_cuda_float32 PASSED [0.0063s] [ 70%] 2025-12-04T13:28:26.5285395Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resize_as__cuda_float32 PASSED [0.0048s] [ 70%] 2025-12-04T13:28:26.5285522Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_resolve_conj_cuda_float32 PASSED [0.0030s] [ 70%] 2025-12-04T13:28:26.5285632Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_cuda_float32 PASSED [1.3029s] [ 70%] 2025-12-04T13:28:26.5285822Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_round_decimals_3_cuda_float32 PASSED [0.0070s] [ 70%] 2025-12-04T13:28:26.5285933Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_rsqrt_cuda_float32 PASSED [1.3212s] [ 70%] 2025-12-04T13:28:26.5286061Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_scatter_add_cuda_float32 PASSED [0.0124s] [ 70%] 2025-12-04T13:28:26.5286170Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sgn_cuda_float32 PASSED [1.3125s] [ 70%] 2025-12-04T13:28:26.5286318Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_signal_windows_exponential_cuda_float32 PASSED [0.0175s] [ 70%] 2025-12-04T13:28:26.5286431Z 
test_ops.py::TestFakeTensorCUDA::test_fake_autocast_sinh_cuda_float32 PASSED [0.0033s] [ 70%] 2025-12-04T13:28:26.5286578Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_i1_cuda_float32 PASSED [0.0044s] [ 70%] 2025-12-04T13:28:26.5286721Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_modified_bessel_k1_cuda_float32 PASSED [0.0055s] [ 70%] 2025-12-04T13:28:26.5286841Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_special_zeta_cuda_float32 PASSED [0.0099s] [ 70%] 2025-12-04T13:28:26.5286960Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_squeeze_cuda_float32 PASSED [0.0085s] [ 70%] 2025-12-04T13:28:26.5287090Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stack_cuda_float32 PASSED [0.0099s] [ 70%] 2025-12-04T13:28:26.5287211Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_std_mean_cuda_float32 PASSED [1.3533s] [ 70%] 2025-12-04T13:28:26.5287332Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_stft_cuda_float32 PASSED [0.0247s] [ 70%] 2025-12-04T13:28:26.5287448Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tile_cuda_float32 PASSED [0.0387s] [ 70%] 2025-12-04T13:28:26.5287609Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_to_sparse_cuda_float32 SKIPPED [0.0011s] (Skip failing test) [ 71%] 2025-12-04T13:28:26.5287724Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_topk_cuda_float32 PASSED [1.3226s] [ 71%] 2025-12-04T13:28:26.5287979Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_torch_ops_aten__efficient_attention_forward_cuda_float32 SKIPPED [0.0012s] (Efficient attention on ROCM doesn't support custom_mask_type==2) [ 71%] 2025-12-04T13:28:26.5288106Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_tril_indices_cuda_int64 PASSED [0.0142s] [ 71%] 2025-12-04T13:28:26.5288215Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_cuda_float32 PASSED [1.3340s] [ 71%] 2025-12-04T13:28:26.5288338Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_triu_indices_cuda_int64 
PASSED [0.0152s] [ 71%] 2025-12-04T13:28:26.5288456Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unbind_copy_cuda_float32 PASSED [0.0082s] [ 71%] 2025-12-04T13:28:26.5288580Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unflatten_cuda_float32 PASSED [0.0122s] [ 71%] 2025-12-04T13:28:26.5288705Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsafe_split_cuda_float32 PASSED [0.0042s] [ 71%] 2025-12-04T13:28:26.5288822Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_unsqueeze_cuda_float32 PASSED [0.0081s] [ 71%] 2025-12-04T13:28:26.5288956Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_var_mean_unbiased_cuda_float32 PASSED [0.0040s] [ 71%] 2025-12-04T13:28:26.5289078Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_as_real_cuda_complex64 PASSED [0.0034s] [ 71%] 2025-12-04T13:28:26.5289200Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_view_copy_cuda_float32 PASSED [0.0102s] [ 71%] 2025-12-04T13:28:26.5289313Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_vsplit_cuda_float32 PASSED [0.0050s] [ 71%] 2025-12-04T13:28:26.5289429Z test_ops.py::TestFakeTensorCUDA::test_fake_autocast_zeros_cuda_float32 PASSED [1.3345s] [ 71%] 2025-12-04T13:28:26.5289532Z test_ops.py::TestFakeTensorCUDA::test_fake_bincount_cuda_int64 PASSED [0.0282s] [ 71%] 2025-12-04T13:28:26.5289643Z test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_and_cuda_int64 PASSED [0.0104s] [ 71%] 2025-12-04T13:28:26.5289759Z test_ops.py::TestFakeTensorCUDA::test_fake_bitwise_not_cuda_int64 PASSED [0.0043s] [ 71%] 2025-12-04T13:28:26.5289871Z test_ops.py::TestFakeTensorCUDA::test_fake_block_diag_cuda_float32 PASSED [0.0066s] [ 71%] 2025-12-04T13:28:26.5289985Z test_ops.py::TestFakeTensorCUDA::test_fake_broadcast_tensors_cuda_float32 PASSED [0.0037s] [ 71%] 2025-12-04T13:28:26.5290090Z test_ops.py::TestFakeTensorCUDA::test_fake_byte_cuda_float32 PASSED [0.0074s] [ 71%] 2025-12-04T13:28:26.5290200Z 
test_ops.py::TestFakeTensorCUDA::test_fake_cartesian_prod_cuda_float32 PASSED [0.0103s] [ 71%] 2025-12-04T13:28:26.5290308Z test_ops.py::TestFakeTensorCUDA::test_fake_cdouble_cuda_float32 PASSED [0.0074s] [ 71%] 2025-12-04T13:28:26.5290416Z test_ops.py::TestFakeTensorCUDA::test_fake_ceil_cuda_float32 PASSED [0.0029s] [ 71%] 2025-12-04T13:28:26.5290516Z test_ops.py::TestFakeTensorCUDA::test_fake_cfloat_cuda_float32 PASSED [0.0073s] [ 71%] 2025-12-04T13:28:26.5290618Z test_ops.py::TestFakeTensorCUDA::test_fake_chalf_cuda_float32 PASSED [0.0067s] [ 72%] 2025-12-04T13:28:26.5290716Z test_ops.py::TestFakeTensorCUDA::test_fake_clone_cuda_float32 PASSED [0.0034s] [ 72%] 2025-12-04T13:28:26.5290816Z test_ops.py::TestFakeTensorCUDA::test_fake_conj_cuda_float32 PASSED [0.0034s] [ 72%] 2025-12-04T13:28:26.5290951Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_T_cuda_float32 PASSED [0.0061s] [ 72%] 2025-12-04T13:28:26.5291086Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rmod___cuda_float32 PASSED [0.0498s] [ 72%] 2025-12-04T13:28:26.5291227Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp___rpow___cuda_float32 PASSED [0.1003s] [ 72%] 2025-12-04T13:28:26.5291381Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp__segment_reduce_lengths_cuda_float32 PASSED [0.2270s] [ 72%] 2025-12-04T13:28:26.5291548Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_addbmm_cuda_float32 PASSED [0.0952s] [ 72%] 2025-12-04T13:28:26.5291701Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_as_strided_partial_views_cuda_float32 PASSED [0.0108s] [ 72%] 2025-12-04T13:28:26.5291831Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cdouble_cuda_float32 PASSED [0.0229s] [ 72%] 2025-12-04T13:28:26.5292004Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_chalf_cuda_float32 PASSED [0.0229s] [ 72%] 2025-12-04T13:28:26.5292145Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_combinations_cuda_float32 PASSED [0.4212s] [ 72%] 2025-12-04T13:28:26.5292271Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_conj_cuda_float32 PASSED [0.0046s] [ 72%] 2025-12-04T13:28:26.5292404Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_copysign_cuda_float32 PASSED [0.0494s] [ 72%] 2025-12-04T13:28:26.5292534Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumprod_cuda_float32 PASSED [0.2326s] [ 72%] 2025-12-04T13:28:26.5292684Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_cumulative_trapezoid_cuda_float32 PASSED [0.1334s] [ 72%] 2025-12-04T13:28:26.5292808Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_diff_cuda_float32 PASSED [1.0192s] [ 72%] 2025-12-04T13:28:26.5292955Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_div_floor_rounding_cuda_float32 PASSED [0.0437s] [ 72%] 2025-12-04T13:28:26.5293079Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_double_cuda_float32 PASSED [0.0137s] [ 72%] 2025-12-04T13:28:26.5293210Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_dstack_cuda_float32 PASSED [0.0225s] [ 72%] 2025-12-04T13:28:26.5293345Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_copy_cuda_float32 PASSED [0.0198s] [ 72%] 2025-12-04T13:28:26.5293476Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_expand_cuda_float32 PASSED [0.0191s] [ 72%] 2025-12-04T13:28:26.5293605Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ifftn_cuda_float32 PASSED [0.0416s] [ 72%] 2025-12-04T13:28:26.5293758Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft2_cuda_float32 PASSED [0.0562s] [ 72%] 2025-12-04T13:28:26.5293893Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_ihfft_cuda_float32 PASSED [0.0469s] [ 72%] 2025-12-04T13:28:26.5294023Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_irfft2_cuda_float32 PASSED [0.0413s] [ 73%] 2025-12-04T13:28:26.5294154Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fft_rfft2_cuda_float32 PASSED [0.0324s] [ 73%] 2025-12-04T13:28:26.5294280Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_fill_cuda_float32 PASSED [0.0068s] [ 73%] 2025-12-04T13:28:26.5294417Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_index_copy_cuda_float32 PASSED [0.0126s] [ 73%] 2025-12-04T13:28:26.5294543Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_ldexp_cuda_float32 PASSED [0.0580s] [ 73%] 2025-12-04T13:28:26.5294689Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cholesky_cuda_float32 PASSED [0.5439s] [ 73%] 2025-12-04T13:28:26.5294838Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_cond_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 73%] 2025-12-04T13:28:26.5294993Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_eigh_cuda_float32 PASSED [0.2618s] [ 73%] 2025-12-04T13:28:26.5295144Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_lstsq_grad_oriented_cuda_float32 PASSED [0.2762s] [ 73%] 2025-12-04T13:28:26.5295331Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_pinv_singular_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 73%] 2025-12-04T13:28:26.5295509Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_qr_cuda_float32 PASSED [0.9941s] [ 73%] 2025-12-04T13:28:26.5295655Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_solve_ex_cuda_float32 PASSED [0.2582s] [ 73%] 2025-12-04T13:28:26.5295807Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_svd_cuda_float32 SKIPPED [0.0002s] (Skipped!) 
[ 73%] 2025-12-04T13:28:26.5295947Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_tensorinv_cuda_float32 PASSED [0.0364s] [ 73%] 2025-12-04T13:28:26.5296087Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vander_cuda_float32 PASSED [0.2922s] [ 73%] 2025-12-04T13:28:26.5296227Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_linalg_vector_norm_cuda_float32 PASSED [1.1366s] [ 73%] 2025-12-04T13:28:26.5296357Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_log1p_cuda_float32 PASSED [1.3591s] [ 73%] 2025-12-04T13:28:26.5296487Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logaddexp_cuda_float32 PASSED [0.0741s] [ 73%] 2025-12-04T13:28:26.5296618Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_logdet_cuda_float32 PASSED [0.0813s] [ 73%] 2025-12-04T13:28:26.5296747Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_lu_solve_cuda_float32 PASSED [1.0691s] [ 73%] 2025-12-04T13:28:26.5296884Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_mean_cuda_float32 PASSED [0.8387s] [ 73%] 2025-12-04T13:28:26.5297017Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_prod_cuda_float32 PASSED [1.3302s] [ 73%] 2025-12-04T13:28:26.5297154Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_masked_select_cuda_float32 PASSED [0.0240s] [ 73%] 2025-12-04T13:28:26.5297276Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_mode_cuda_float32 PASSED [0.0224s] [ 73%] 2025-12-04T13:28:26.5297403Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nansum_cuda_float32 PASSED [0.1355s] [ 73%] 2025-12-04T13:28:26.5297556Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_avg_pool2d_cuda_float32 PASSED [1.3930s] [ 74%] 2025-12-04T13:28:26.5297716Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_batch_norm_cuda_float32 
PASSED [0.1068s] [ 74%] 2025-12-04T13:28:26.5297898Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32 PASSED [0.2523s] [ 74%] 2025-12-04T13:28:26.5298221Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_conv2d_cuda_float32 MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801000 size: 768 2025-12-04T13:28:26.5298408Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801000 size: 768 2025-12-04T13:28:26.5298599Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801200 size: 1024 2025-12-04T13:28:26.5298783Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801200 size: 1024 2025-12-04T13:28:26.5298981Z MIOpen(HIP): Warning [IsEnoughWorkspace] [GetSolutionsFallback AI] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801400 size: 1024 2025-12-04T13:28:26.5299185Z MIOpen(HIP): Warning [IsEnoughWorkspace] [EvaluateInvokers] Solver , workspace required: 1200, provided ptr: 0x7b2e8b801400 size: 1024 2025-12-04T13:28:26.5299232Z PASSED [0.4364s] [ 74%] 2025-12-04T13:28:26.5299402Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_embedding_bag_cuda_float32 PASSED [0.1366s] [ 74%] 2025-12-04T13:28:26.5299570Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_fractional_max_pool2d_cuda_float32 PASSED [0.0969s] [ 74%] 2025-12-04T13:28:26.5299734Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_group_norm_cuda_float32 PASSED [1.6364s] [ 74%] 2025-12-04T13:28:26.5299884Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_layer_norm_cuda_float32 PASSED [0.0518s] [ 74%] 
2025-12-04T13:28:26.5300024Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mish_cuda_float32 PASSED [0.0158s] [ 74%] 2025-12-04T13:28:26.5300174Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_mse_loss_cuda_float32 PASSED [0.0329s] [ 74%] 2025-12-04T13:28:26.5300324Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pad_reflect_cuda_float32 PASSED [0.0229s] [ 74%] 2025-12-04T13:28:26.5300485Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pairwise_distance_cuda_float32 PASSED [0.0671s] [ 74%] 2025-12-04T13:28:26.5300628Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pdist_cuda_float32 PASSED [0.0214s] [ 74%] 2025-12-04T13:28:26.5300785Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_pixel_unshuffle_cuda_float32 PASSED [0.0103s] [ 74%] 2025-12-04T13:28:26.5300942Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_poisson_nll_loss_cuda_float32 PASSED [1.1948s] [ 74%] 2025-12-04T13:28:26.5301087Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_prelu_cuda_float32 PASSED [0.2575s] [ 74%] 2025-12-04T13:28:26.5301235Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_nn_functional_softmin_cuda_float32 PASSED [0.0345s] [ 74%] 2025-12-04T13:28:26.5301383Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_polygamma_polygamma_n_0_cuda_float32 PASSED [0.0255s] [ 74%] 2025-12-04T13:28:26.5301514Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_positive_cuda_float32 PASSED [1.3767s] [ 74%] 2025-12-04T13:28:26.5301638Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_put_cuda_float32 PASSED [0.0872s] [ 74%] 2025-12-04T13:28:26.5301768Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_quantile_cuda_float32 PASSED [1.5715s] [ 74%] 
2025-12-04T13:28:26.5301954Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_rad2deg_cuda_float32 PASSED [0.0044s] [ 74%] 2025-12-04T13:28:26.5302082Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_real_cuda_float32 PASSED [1.3910s] [ 74%] 2025-12-04T13:28:26.5302211Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reciprocal_cuda_float32 PASSED [0.0152s] [ 74%] 2025-12-04T13:28:26.5302342Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_remainder_cuda_float32 PASSED [0.0546s] [ 74%] 2025-12-04T13:28:26.5302470Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_reshape_cuda_float32 PASSED [0.0183s] [ 75%] 2025-12-04T13:28:26.5302600Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_cuda_float32 PASSED [1.4188s] [ 75%] 2025-12-04T13:28:26.5302742Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_scatter_reduce_mean_cuda_float32 PASSED [0.1480s] [ 75%] 2025-12-04T13:28:26.5302867Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_sin_cuda_float32 PASSED [0.0050s] [ 75%] 2025-12-04T13:28:26.5302995Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_cuda_float32 PASSED [0.0220s] [ 75%] 2025-12-04T13:28:26.5303149Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_softmax_with_dtype_cuda_float32 PASSED [0.0374s] [ 75%] 2025-12-04T13:28:26.5303286Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_special_ndtri_cuda_float32 PASSED [0.0162s] [ 75%] 2025-12-04T13:28:26.5303426Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_squeeze_cuda_float32 PASSED [0.0173s] [ 75%] 2025-12-04T13:28:26.5303566Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stack_cuda_float32 PASSED [0.0263s] [ 75%] 2025-12-04T13:28:26.5303687Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_stft_cuda_float32 PASSED [0.1818s] [ 75%] 2025-12-04T13:28:26.5303813Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_svd_cuda_float32 PASSED [8.5842s] [ 75%] 2025-12-04T13:28:26.5304076Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_torch_ops_aten__efficient_attention_forward_cuda_float32 SKIPPED [0.0009s] (Efficient attention on ROCM doesn't support custom_mask_type==2) [ 75%] 2025-12-04T13:28:26.5304204Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_trace_cuda_float32 PASSED [0.0061s] [ 75%] 2025-12-04T13:28:26.5304328Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_unfold_cuda_float32 PASSED [0.0871s] [ 75%] 2025-12-04T13:28:26.5304464Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_var_unbiased_cuda_float32 PASSED [0.0093s] [ 75%] 2025-12-04T13:28:26.5304590Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_view_as_cuda_float32 PASSED [0.0100s] [ 75%] 2025-12-04T13:28:26.5304719Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_amp_vsplit_cuda_float32 PASSED [0.0160s] [ 75%] 2025-12-04T13:28:26.5304853Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp___rmod___cuda_float32 PASSED [0.0490s] [ 75%] 2025-12-04T13:28:26.5305002Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp__batch_norm_with_update_cuda_float32 PASSED [0.1768s] [ 75%] 2025-12-04T13:28:26.5305136Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_addcdiv_cuda_float32 PASSED [0.0905s] [ 75%] 2025-12-04T13:28:26.5305279Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_as_strided_scatter_cuda_float32 PASSED [0.0271s] [ 75%] 2025-12-04T13:28:26.5305417Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_1d_cuda_float32 PASSED [1.3977s] [ 75%] 2025-12-04T13:28:26.5305551Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_atleast_2d_cuda_float32 PASSED [0.0166s] [ 75%] 2025-12-04T13:28:26.5305681Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_baddbmm_cuda_float32 PASSED [0.0358s] [ 75%] 2025-12-04T13:28:26.5305817Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ceil_cuda_float32 PASSED [0.0040s] [ 75%] 2025-12-04T13:28:26.5305960Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cholesky_inverse_cuda_float32 PASSED [0.2827s] [ 76%] 2025-12-04T13:28:26.5306089Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_chunk_cuda_float32 PASSED [0.0130s] [ 76%] 2025-12-04T13:28:26.5306223Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_clamp_min_cuda_float32 PASSED [0.0568s] [ 76%] 2025-12-04T13:28:26.5306355Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_complex_cuda_float32 PASSED [0.0520s] [ 76%] 2025-12-04T13:28:26.5306503Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_cumulative_trapezoid_cuda_float32 PASSED [0.1310s] [ 76%] 2025-12-04T13:28:26.5306633Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_deg2rad_cuda_float32 PASSED [0.0042s] [ 76%] 2025-12-04T13:28:26.5306760Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_double_cuda_float32 PASSED [0.0124s] [ 76%] 2025-12-04T13:28:26.5306891Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_dstack_cuda_float32 PASSED [0.0216s] [ 76%] 2025-12-04T13:28:26.5307027Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_erf_cuda_float32 PASSED [1.3636s] [ 76%] 2025-12-04T13:28:26.5307161Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_as_cuda_float32 PASSED [0.0104s] [ 76%] 2025-12-04T13:28:26.5307318Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_expand_copy_cuda_float32 PASSED [0.0219s] [ 76%] 2025-12-04T13:28:26.5307461Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_fft2_cuda_float32 PASSED [0.0332s] [ 76%] 
2025-12-04T13:28:26.5307590Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_hfftn_cuda_float32 PASSED [0.0783s] [ 76%] 2025-12-04T13:28:26.5307724Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fft_ifft2_cuda_float32 PASSED [0.0324s] [ 76%] 2025-12-04T13:28:26.5307848Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fill_cuda_float32 PASSED [1.4502s] [ 76%] 2025-12-04T13:28:26.5307980Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flatten_cuda_float32 PASSED [0.0150s] [ 76%] 2025-12-04T13:28:26.5308105Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_flip_cuda_float32 PASSED [0.0190s] [ 76%] 2025-12-04T13:28:26.5308235Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_float_cuda_float32 PASSED [0.0066s] [ 76%] 2025-12-04T13:28:26.5308362Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_fmax_cuda_float32 PASSED [0.0655s] [ 76%] 2025-12-04T13:28:26.5308488Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_frac_cuda_float32 PASSED [0.0038s] [ 76%] 2025-12-04T13:28:26.5308623Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_put_cuda_float32 PASSED [0.0124s] [ 76%] 2025-12-04T13:28:26.5308758Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_index_select_cuda_float32 PASSED [0.0104s] [ 76%] 2025-12-04T13:28:26.5308888Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_inner_cuda_float32 PASSED [0.0131s] [ 76%] 2025-12-04T13:28:26.5309018Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_kthvalue_cuda_float32 PASSED [0.0301s] [ 76%] 2025-12-04T13:28:26.5309178Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lstsq_grad_oriented_cuda_float32 PASSED [0.2331s] [ 76%] 2025-12-04T13:28:26.5309318Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_lu_factor_cuda_float32 PASSED 
[0.9494s] [ 77%] 2025-12-04T13:28:26.5309468Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_pinv_hermitian_cuda_float32 PASSED [0.3646s] [ 77%] 2025-12-04T13:28:26.5309630Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_solve_triangular_cuda_float32 PASSED [2.9033s] [ 77%] 2025-12-04T13:28:26.5309769Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_linalg_vecdot_cuda_float32 PASSED [0.2041s] [ 77%] 2025-12-04T13:28:26.5309905Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_logsumexp_cuda_float32 PASSED [0.0805s] [ 77%] 2025-12-04T13:28:26.5310035Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_solve_cuda_float32 PASSED [0.9462s] [ 77%] 2025-12-04T13:28:26.5310169Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_lu_unpack_cuda_float32 PASSED [0.2382s] [ 77%] 2025-12-04T13:28:26.5310303Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_amax_cuda_float32 PASSED [0.8870s] [ 77%] 2025-12-04T13:28:26.5310451Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_log_softmax_cuda_float32 PASSED [0.2124s] [ 77%] 2025-12-04T13:28:26.5310586Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_norm_cuda_float32 PASSED [4.7513s] [ 77%] 2025-12-04T13:28:26.5310729Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_softmin_cuda_float32 PASSED [0.2222s] [ 77%] 2025-12-04T13:28:26.5310875Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_masked_std_cuda_float32 PASSED [1.6341s] [ 77%] 2025-12-04T13:28:26.5311007Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matmul_cuda_float32 PASSED [0.1356s] [ 77%] 2025-12-04T13:28:26.5311151Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_matrix_exp_cuda_float32 PASSED [1.4785s] [ 77%] 2025-12-04T13:28:26.5311305Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_max_reduction_with_dim_cuda_float32 PASSED [0.0165s] [ 77%] 2025-12-04T13:28:26.5311447Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_maximum_cuda_float32 PASSED [0.0755s] [ 77%] 2025-12-04T13:28:26.5311577Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_median_cuda_float32 PASSED [0.0530s] [ 77%] 2025-12-04T13:28:26.5311731Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_meshgrid_list_of_tensors_cuda_float32 PASSED [0.0772s] [ 77%] 2025-12-04T13:28:26.5311916Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_min_reduction_no_dim_cuda_float32 PASSED [0.0158s] [ 77%] 2025-12-04T13:28:26.5312049Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_minimum_cuda_float32 PASSED [0.0755s] [ 77%] 2025-12-04T13:28:26.5312172Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mm_cuda_float32 PASSED [0.0141s] [ 77%] 2025-12-04T13:28:26.5312302Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_msort_cuda_float32 PASSED [0.0071s] [ 77%] 2025-12-04T13:28:26.5312426Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mul_cuda_float32 PASSED [0.0326s] [ 77%] 2025-12-04T13:28:26.5312552Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mv_cuda_float32 PASSED [0.0072s] [ 77%] 2025-12-04T13:28:26.5312701Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_1_cuda_float32 PASSED [0.0521s] [ 77%] 2025-12-04T13:28:26.5312852Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_mvlgamma_mvlgamma_p_3_cuda_float32 PASSED [0.0518s] [ 78%] 2025-12-04T13:28:26.5312983Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nanmedian_cuda_float32 PASSED [0.0526s] [ 78%] 2025-12-04T13:28:26.5313114Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nansum_cuda_float32 PASSED 
[0.1383s] [ 78%] 2025-12-04T13:28:26.5313282Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_adaptive_max_pool3d_cuda_float32 PASSED [0.0398s] [ 78%] 2025-12-04T13:28:26.5313438Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_avg_pool2d_cuda_float32 PASSED [0.0149s] [ 78%] 2025-12-04T13:28:26.5313637Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_binary_cross_entropy_with_logits_cuda_float32 PASSED [1.6970s] [ 78%] 2025-12-04T13:28:26.5313781Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_celu_cuda_float32 PASSED [0.0171s] [ 78%] 2025-12-04T13:28:26.5313929Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_conv1d_cuda_float32 PASSED [0.0640s] [ 78%] 2025-12-04T13:28:26.5314085Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_cross_entropy_cuda_float32 PASSED [0.2574s] [ 78%] 2025-12-04T13:28:26.5314234Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_dropout_cuda_float32 PASSED [0.0339s] [ 78%] 2025-12-04T13:28:26.5314376Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_elu_cuda_float32 PASSED [0.0145s] [ 78%] 2025-12-04T13:28:26.5314541Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_gaussian_nll_loss_cuda_float32 PASSED [6.9462s] [ 78%] 2025-12-04T13:28:26.5314706Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_interpolate_trilinear_cuda_float32 PASSED [0.4608s] [ 78%] 2025-12-04T13:28:26.5314876Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_leaky_relu_cuda_float32 PASSED [0.0258s] [ 78%] 2025-12-04T13:28:26.5315037Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_local_response_norm_cuda_float32 PASSED [0.2154s] [ 78%] 2025-12-04T13:28:26.5315207Z 
test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_pool3d_cuda_float32 PASSED [1.1638s] [ 78%] 2025-12-04T13:28:26.5317523Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_max_unpool2d_grad_cuda_float32 PASSED [0.1522s] [ 78%] 2025-12-04T13:28:26.5317687Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pad_reflect_cuda_float32 PASSED [0.0229s] [ 78%] 2025-12-04T13:28:26.5317848Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_pixel_unshuffle_cuda_float32 PASSED [0.0104s] [ 78%] 2025-12-04T13:28:26.5317994Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_rrelu_cuda_float32 PASSED [0.0192s] [ 78%] 2025-12-04T13:28:26.5318148Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_threshold_cuda_float32 PASSED [0.0135s] [ 78%] 2025-12-04T13:28:26.5318311Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_nn_functional_triplet_margin_loss_cuda_float32 PASSED [0.2002s] [ 78%] 2025-12-04T13:28:26.5318453Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_copy_cuda_float32 PASSED [0.0151s] [ 78%] 2025-12-04T13:28:26.5318615Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_permute_cuda_float32 PASSED [0.0093s] [ 78%] 2025-12-04T13:28:26.5318744Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polar_cuda_float32 PASSED [0.0989s] [ 78%] 2025-12-04T13:28:26.5318898Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_0_cuda_float32 PASSED [0.0243s] [ 79%] 2025-12-04T13:28:26.5319051Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_polygamma_polygamma_n_2_cuda_float32 PASSED [0.0240s] [ 79%] 2025-12-04T13:28:26.5319181Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_ravel_cuda_float32 PASSED [0.0092s] [ 79%] 
2025-12-04T13:28:26.5319311Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_real_cuda_float32 PASSED [0.0039s] [ 79%] 2025-12-04T13:28:26.5319443Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_remainder_cuda_float32 PASSED [0.0499s] [ 79%] 2025-12-04T13:28:26.5319575Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_reshape_cuda_float32 PASSED [0.0156s] [ 79%] 2025-12-04T13:28:26.5319717Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_round_decimals_0_cuda_float32 PASSED [0.0070s] [ 79%] 2025-12-04T13:28:26.5319865Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_scatter_reduce_mean_cuda_float32 PASSED [0.1422s] [ 79%] 2025-12-04T13:28:26.5319992Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinc_cuda_float32 PASSED [0.0257s] [ 79%] 2025-12-04T13:28:26.5320120Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sinh_cuda_float32 PASSED [0.0046s] [ 79%] 2025-12-04T13:28:26.5320248Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_slice_cuda_float32 PASSED [1.4570s] [ 79%] 2025-12-04T13:28:26.5320412Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sparse_sampled_addmm_cuda_float32 SKIPPED [0.0003s] (Skipped!) 
[ 79%] 2025-12-04T13:28:26.5320543Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_squeeze_cuda_float32 PASSED [1.4752s] [ 79%] 2025-12-04T13:28:26.5320669Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_std_cuda_float32 PASSED [0.1006s] [ 79%] 2025-12-04T13:28:26.5320797Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_sub_cuda_float32 PASSED [0.0364s] [ 79%] 2025-12-04T13:28:26.5320940Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_t_copy_cuda_float32 PASSED [0.0074s] [ 79%] 2025-12-04T13:28:26.5321075Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tensordot_cuda_float32 PASSED [0.0452s] [ 79%] 2025-12-04T13:28:26.5321211Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_tril_cuda_float32 PASSED [0.0197s] [ 79%] 2025-12-04T13:28:26.5321339Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_triu_cuda_float32 PASSED [0.0195s] [ 79%] 2025-12-04T13:28:26.5321545Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unfold_cuda_float32 PASSED [0.0630s] [ 79%] 2025-12-04T13:28:26.5321684Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_chunk_cuda_float32 PASSED [0.0125s] [ 79%] 2025-12-04T13:28:26.5321820Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_unsafe_split_cuda_float32 PASSED [0.0069s] [ 79%] 2025-12-04T13:28:26.5322006Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_cuda_float32 PASSED [0.0922s] [ 79%] 2025-12-04T13:28:26.5322150Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_mean_unbiased_cuda_float32 PASSED [0.0139s] [ 79%] 2025-12-04T13:28:26.5322292Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_var_unbiased_cuda_float32 PASSED [0.0091s] [ 79%] 2025-12-04T13:28:26.5322416Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_vdot_cuda_float32 PASSED [1.4997s] [ 
80%] 2025-12-04T13:28:26.5322560Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_complex_cuda_float32 PASSED [0.0062s] [ 80%] 2025-12-04T13:28:26.5322691Z test_ops.py::TestFakeTensorCUDA::test_fake_crossref_backward_no_amp_view_as_cuda_float32 PASSED [0.0109s] [ 80%] 2025-12-04T13:28:26.5322790Z test_ops.py::TestFakeTensorCUDA::test_fake_deg2rad_cuda_float32 PASSED [0.0032s] [ 80%] 2025-12-04T13:28:26.5322926Z test_ops.py::TestFakeTensorCUDA::test_fake_diag_cuda_float32 PASSED [0.0128s] [ 80%] 2025-12-04T13:28:26.5323036Z test_ops.py::TestFakeTensorCUDA::test_fake_diag_embed_cuda_float32 PASSED [0.0166s] [ 80%] 2025-12-04T13:28:26.5323137Z test_ops.py::TestFakeTensorCUDA::test_fake_diagflat_cuda_float32 PASSED [1.4753s] [ 80%] 2025-12-04T13:28:26.5323253Z test_ops.py::TestFakeTensorCUDA::test_fake_diagonal_copy_cuda_float32 PASSED [0.0139s] [ 80%] 2025-12-04T13:28:26.5323350Z test_ops.py::TestFakeTensorCUDA::test_fake_diff_cuda_float32 PASSED [0.1693s] [ 80%] 2025-12-04T13:28:26.5323452Z test_ops.py::TestFakeTensorCUDA::test_fake_dot_cuda_float32 PASSED [0.0037s] [ 80%] 2025-12-04T13:28:26.5323551Z test_ops.py::TestFakeTensorCUDA::test_fake_dsplit_cuda_float32 PASSED [0.0051s] [ 80%] 2025-12-04T13:28:26.5323651Z test_ops.py::TestFakeTensorCUDA::test_fake_einsum_cuda_float32 PASSED [0.0323s] [ 80%] 2025-12-04T13:28:26.5323755Z test_ops.py::TestFakeTensorCUDA::test_fake_empty_like_cuda_float32 PASSED [0.0076s] [ 80%] 2025-12-04T13:28:26.5323869Z test_ops.py::TestFakeTensorCUDA::test_fake_expand_copy_cuda_float32 PASSED [0.0081s] [ 80%] 2025-12-04T13:28:26.5323967Z test_ops.py::TestFakeTensorCUDA::test_fake_fft_fft2_cuda_float32 PASSED [0.0100s] [ 80%] 2025-12-04T13:28:26.5324071Z test_ops.py::TestFakeTensorCUDA::test_fake_fft_fftn_cuda_float32 PASSED [0.0120s] [ 80%] 2025-12-04T13:28:26.5324180Z test_ops.py::TestFakeTensorCUDA::test_fake_fft_ifftn_cuda_float32 PASSED [0.0120s] [ 80%] 2025-12-04T13:28:26.5324284Z 
test_ops.py::TestFakeTensorCUDA::test_fake_fft_irfftn_cuda_float32 PASSED [0.0123s] [ 80%] 2025-12-04T13:28:26.5324389Z test_ops.py::TestFakeTensorCUDA::test_fake_fft_rfftn_cuda_float32 PASSED [0.0088s] [ 80%] 2025-12-04T13:28:26.5324490Z test_ops.py::TestFakeTensorCUDA::test_fake_full_like_cuda_float32 PASSED [0.0075s] [ 80%] 2025-12-04T13:28:26.5324589Z test_ops.py::TestFakeTensorCUDA::test_fake_gcd_cuda_int64 PASSED [0.0113s] [ 80%] 2025-12-04T13:28:26.5324685Z test_ops.py::TestFakeTensorCUDA::test_fake_ge_cuda_float32 PASSED [0.0096s] [ 80%] 2025-12-04T13:28:26.5324824Z test_ops.py::TestFakeTensorCUDA::test_fake_geometric_cuda_float32 PASSED [1.4914s] [ 80%] 2025-12-04T13:28:26.5324917Z test_ops.py::TestFakeTensorCUDA::test_fake_gt_cuda_float32 PASSED [0.0126s] [ 80%] 2025-12-04T13:28:26.5325018Z test_ops.py::TestFakeTensorCUDA::test_fake_half_cuda_float32 PASSED [0.0078s] [ 80%] 2025-12-04T13:28:26.5325133Z test_ops.py::TestFakeTensorCUDA::test_fake_index_put_cuda_float32 PASSED [0.0061s] [ 81%] 2025-12-04T13:28:26.5325266Z test_ops.py::TestFakeTensorCUDA::test_fake_index_reduce_mean_cuda_float32 PASSED [0.0095s] [ 81%] 2025-12-04T13:28:26.5325391Z test_ops.py::TestFakeTensorCUDA::test_fake_isclose_cuda_float32 PASSED [0.0612s] [ 81%] 2025-12-04T13:28:26.5325495Z test_ops.py::TestFakeTensorCUDA::test_fake_isin_cuda_float32 PASSED [0.0041s] [ 81%] 2025-12-04T13:28:26.5325594Z test_ops.py::TestFakeTensorCUDA::test_fake_isinf_cuda_float32 PASSED [0.0030s] [ 81%] 2025-12-04T13:28:26.5325700Z test_ops.py::TestFakeTensorCUDA::test_fake_isneginf_cuda_float32 PASSED [0.0029s] [ 81%] 2025-12-04T13:28:26.5325830Z test_ops.py::TestFakeTensorCUDA::test_fake_istft_cuda_complex64 SKIPPED [0.0010s] (Skip failing test) [ 81%] 2025-12-04T13:28:26.5325933Z test_ops.py::TestFakeTensorCUDA::test_fake_kthvalue_cuda_float32 PASSED [0.0095s] [ 81%] 2025-12-04T13:28:26.5326031Z test_ops.py::TestFakeTensorCUDA::test_fake_lgamma_cuda_float32 PASSED [0.0042s] [ 81%] 
2025-12-04T13:28:26.5326249Z test_ops.py::TestFakeTensorCUDA::test_fake_linalg_householder_product_cuda_float32 SKIPPED [0.0005s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 81%] 2025-12-04T13:28:26.5326367Z test_ops.py::TestFakeTensorCUDA::test_fake_linalg_lu_solve_cuda_float32 PASSED [0.0982s] [ 81%] 2025-12-04T13:28:26.5326479Z test_ops.py::TestFakeTensorCUDA::test_fake_linalg_multi_dot_cuda_float32 PASSED [0.0123s] [ 81%] 2025-12-04T13:28:26.5326631Z test_ops.py::TestFakeTensorCUDA::test_fake_linalg_pinv_hermitian_cuda_float32 SKIPPED [0.0009s] (Skip failing test) [ 81%] 2025-12-04T13:28:26.5326737Z test_ops.py::TestFakeTensorCUDA::test_fake_log_softmax_cuda_float32 PASSED [0.0163s] [ 81%] 2025-12-04T13:28:26.5326845Z test_ops.py::TestFakeTensorCUDA::test_fake_logical_xor_cuda_float32 PASSED [0.0111s] [ 81%] 2025-12-04T13:28:26.5326941Z test_ops.py::TestFakeTensorCUDA::test_fake_long_cuda_float32 PASSED [0.0074s] [ 81%] 2025-12-04T13:28:26.5327055Z test_ops.py::TestFakeTensorCUDA::test_fake_masked_argmin_cuda_float32 PASSED [0.0906s] [ 81%] 2025-12-04T13:28:26.5327167Z test_ops.py::TestFakeTensorCUDA::test_fake_masked_logaddexp_cuda_float32 PASSED [0.0342s] [ 81%] 2025-12-04T13:28:26.5327283Z test_ops.py::TestFakeTensorCUDA::test_fake_masked_normalize_cuda_float32 PASSED [0.0519s] [ 81%] 2025-12-04T13:28:26.5327386Z test_ops.py::TestFakeTensorCUDA::test_fake_masked_prod_cuda_float32 PASSED [0.1354s] [ 81%] 2025-12-04T13:28:26.5327499Z test_ops.py::TestFakeTensorCUDA::test_fake_masked_softmin_cuda_float32 PASSED [0.0383s] [ 81%] 2025-12-04T13:28:26.5327601Z test_ops.py::TestFakeTensorCUDA::test_fake_max_binary_cuda_float32 PASSED [0.0098s] [ 81%] 2025-12-04T13:28:26.5327705Z test_ops.py::TestFakeTensorCUDA::test_fake_minimum_cuda_float32 PASSED [0.0097s] [ 81%] 2025-12-04T13:28:26.5327838Z test_ops.py::TestFakeTensorCUDA::test_fake_multinomial_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 81%] 2025-12-04T13:28:26.5327946Z 
test_ops.py::TestFakeTensorCUDA::test_fake_nanmedian_cuda_float32 PASSED [0.0117s] [ 81%] 2025-12-04T13:28:26.5328049Z test_ops.py::TestFakeTensorCUDA::test_fake_new_full_cuda_float32 PASSED [0.0077s] [ 82%] 2025-12-04T13:28:26.5328148Z test_ops.py::TestFakeTensorCUDA::test_fake_new_ones_cuda_float32 PASSED [0.0075s] [ 82%] 2025-12-04T13:28:26.5328254Z test_ops.py::TestFakeTensorCUDA::test_fake_nextafter_cuda_float32 PASSED [0.0097s] [ 82%] 2025-12-04T13:28:26.5328389Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool2d_cuda_float32 PASSED [0.0088s] [ 82%] 2025-12-04T13:28:26.5328526Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_avg_pool3d_cuda_float32 PASSED [0.0106s] [ 82%] 2025-12-04T13:28:26.5328671Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_adaptive_max_pool2d_cuda_float32 PASSED [0.0202s] [ 82%] 2025-12-04T13:28:26.5328792Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv2d_cuda_float32 PASSED [0.0388s] [ 82%] 2025-12-04T13:28:26.5328938Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_conv_transpose3d_cuda_float32 PASSED [0.0147s] [ 82%] 2025-12-04T13:28:26.5329091Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cosine_embedding_loss_cuda_float32 PASSED [0.0355s] [ 82%] 2025-12-04T13:28:26.5329230Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_cross_entropy_cuda_float32 PASSED [0.0909s] [ 82%] 2025-12-04T13:28:26.5329346Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_elu_cuda_float32 PASSED [0.0051s] [ 82%] 2025-12-04T13:28:26.5329496Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [0.0114s] [ 82%] 2025-12-04T13:28:26.5329637Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_fractional_max_pool3d_cuda_float32 PASSED [0.0612s] [ 82%] 2025-12-04T13:28:26.5329757Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_group_norm_cuda_float32 PASSED [0.0344s] [ 82%] 
2025-12-04T13:28:26.5329882Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardshrink_cuda_float32 PASSED [0.0074s] [ 82%] 2025-12-04T13:28:26.5330008Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardsigmoid_cuda_float32 PASSED [0.0084s] [ 82%] 2025-12-04T13:28:26.5330129Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_hardswish_cuda_float32 PASSED [0.0104s] [ 82%] 2025-12-04T13:28:26.5330251Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_huber_loss_cuda_float32 PASSED [0.0136s] [ 82%] 2025-12-04T13:28:26.5330379Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_area_cuda_float32 PASSED [0.0197s] [ 82%] 2025-12-04T13:28:26.5330517Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_interpolate_bicubic_cuda_float32 PASSED [1.0542s] [ 82%] 2025-12-04T13:28:26.5330648Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_local_response_norm_cuda_float32 PASSED [0.0288s] [ 82%] 2025-12-04T13:28:26.5330781Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_max_unpool2d_grad_cuda_float32 PASSED [0.0893s] [ 82%] 2025-12-04T13:28:26.5330903Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_normalize_cuda_float32 PASSED [0.0175s] [ 82%] 2025-12-04T13:28:26.5331033Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pad_replicate_cuda_float32 PASSED [0.0102s] [ 82%] 2025-12-04T13:28:26.5331158Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_pixel_shuffle_cuda_float32 PASSED [0.0055s] [ 82%] 2025-12-04T13:28:26.5331278Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_rms_norm_cuda_float32 PASSED [0.0131s] [ 83%] 2025-12-04T13:28:26.5331405Z test_ops.py::TestFakeTensorCUDA::test_fake_nn_functional_soft_margin_loss_cuda_float32 PASSED [0.0145s] [ 83%] 2025-12-04T13:28:26.5331511Z test_ops.py::TestFakeTensorCUDA::test_fake_norm_fro_cuda_float32 PASSED [0.0048s] [ 83%] 2025-12-04T13:28:26.5331609Z test_ops.py::TestFakeTensorCUDA::test_fake_outer_cuda_float32 PASSED 
[0.0033s] [ 83%] 2025-12-04T13:28:26.5331710Z test_ops.py::TestFakeTensorCUDA::test_fake_qr_cuda_float32 PASSED [0.0413s] [ 83%] 2025-12-04T13:28:26.5331815Z test_ops.py::TestFakeTensorCUDA::test_fake_rand_like_cuda_float32 PASSED [0.0112s] [ 83%] 2025-12-04T13:28:26.5331959Z test_ops.py::TestFakeTensorCUDA::test_fake_reshape_cuda_float32 PASSED [0.0083s] [ 83%] 2025-12-04T13:28:26.5332062Z test_ops.py::TestFakeTensorCUDA::test_fake_rsqrt_cuda_float32 PASSED [0.0041s] [ 83%] 2025-12-04T13:28:26.5332158Z test_ops.py::TestFakeTensorCUDA::test_fake_short_cuda_float32 PASSED [0.0073s] [ 83%] 2025-12-04T13:28:26.5332282Z test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_bartlett_cuda_float32 PASSED [0.0122s] [ 83%] 2025-12-04T13:28:26.5332421Z test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_gaussian_cuda_float32 PASSED [0.0153s] [ 83%] 2025-12-04T13:28:26.5332542Z test_ops.py::TestFakeTensorCUDA::test_fake_signal_windows_hamming_cuda_float32 PASSED [0.0292s] [ 83%] 2025-12-04T13:28:26.5332639Z test_ops.py::TestFakeTensorCUDA::test_fake_sin_cuda_float32 PASSED [1.4691s] [ 83%] 2025-12-04T13:28:26.5332754Z test_ops.py::TestFakeTensorCUDA::test_fake_sort_cuda_float32 PASSED [0.0324s] [ 83%] 2025-12-04T13:28:26.5332868Z test_ops.py::TestFakeTensorCUDA::test_fake_special_bessel_y0_cuda_float32 PASSED [0.0047s] [ 83%] 2025-12-04T13:28:26.5333034Z test_ops.py::TestFakeTensorCUDA::test_fake_special_hermite_polynomial_h_cuda_float32 PASSED [0.0112s] [ 83%] 2025-12-04T13:28:26.5333140Z test_ops.py::TestFakeTensorCUDA::test_fake_special_i1e_cuda_float32 PASSED [0.0037s] [ 83%] 2025-12-04T13:28:26.5333275Z test_ops.py::TestFakeTensorCUDA::test_fake_special_laguerre_polynomial_l_cuda_float32 PASSED [0.0094s] [ 83%] 2025-12-04T13:28:26.5333399Z test_ops.py::TestFakeTensorCUDA::test_fake_special_modified_bessel_i1_cuda_float32 PASSED [0.0042s] [ 83%] 2025-12-04T13:28:26.5333545Z 
test_ops.py::TestFakeTensorCUDA::test_fake_special_shifted_chebyshev_polynomial_w_cuda_float32 PASSED [0.0094s] [ 83%] 2025-12-04T13:28:26.5333643Z test_ops.py::TestFakeTensorCUDA::test_fake_split_cuda_float32 PASSED [0.0039s] [ 83%] 2025-12-04T13:28:26.5333747Z test_ops.py::TestFakeTensorCUDA::test_fake_square_cuda_float32 PASSED [0.0044s] [ 83%] 2025-12-04T13:28:26.5333851Z test_ops.py::TestFakeTensorCUDA::test_fake_squeeze_cuda_float32 PASSED [0.0078s] [ 83%] 2025-12-04T13:28:26.5333949Z test_ops.py::TestFakeTensorCUDA::test_fake_std_cuda_float32 PASSED [0.0127s] [ 83%] 2025-12-04T13:28:26.5334049Z test_ops.py::TestFakeTensorCUDA::test_fake_svd_cuda_float32 PASSED [0.2760s] [ 84%] 2025-12-04T13:28:26.5334146Z test_ops.py::TestFakeTensorCUDA::test_fake_t_copy_cuda_float32 PASSED [0.0041s] [ 84%] 2025-12-04T13:28:26.5334247Z test_ops.py::TestFakeTensorCUDA::test_fake_tanh_cuda_float32 PASSED [0.0028s] [ 84%] 2025-12-04T13:28:26.5334343Z test_ops.py::TestFakeTensorCUDA::test_fake_trapz_cuda_float32 PASSED [0.0251s] [ 84%] 2025-12-04T13:28:26.5334454Z test_ops.py::TestFakeTensorCUDA::test_fake_unfold_copy_cuda_float32 PASSED [0.0161s] [ 84%] 2025-12-04T13:28:26.5334551Z test_ops.py::TestFakeTensorCUDA::test_fake_uniform_cuda_float32 PASSED [0.0060s] [ 84%] 2025-12-04T13:28:26.5334654Z test_ops.py::TestFakeTensorCUDA::test_fake_view_as_cuda_float32 PASSED [0.0058s] [ 84%] 2025-12-04T13:28:26.5334751Z test_ops.py::TestFakeTensorCUDA::test_fake_vstack_cuda_float32 PASSED [0.0046s] [ 84%] 2025-12-04T13:28:26.5334852Z test_ops.py::TestFakeTensorCUDA::test_fake_where_cuda_float32 PASSED [0.0082s] [ 84%] 2025-12-04T13:28:26.5334948Z test_ops.py::TestFakeTensorCUDA::test_fake_zeros_cuda_float32 PASSED [0.0032s] [ 84%] 2025-12-04T13:28:26.5335056Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_T_cuda_float32 PASSED [0.0036s] [ 84%] 2025-12-04T13:28:26.5335174Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___getitem___cuda_float32 PASSED [0.0152s] [ 84%] 
2025-12-04T13:28:26.5335292Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rand___cuda_int64 PASSED [0.0115s] [ 84%] 2025-12-04T13:28:26.5335405Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops___rxor___cuda_int64 PASSED [0.0116s] [ 84%] 2025-12-04T13:28:26.5335543Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__softmax_backward_data_cuda_float32 PASSED [0.0086s] [ 84%] 2025-12-04T13:28:26.5335671Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__unsafe_masked_index_cuda_float32 PASSED [0.0349s] [ 84%] 2025-12-04T13:28:26.5335811Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops__upsample_bilinear2d_aa_cuda_float32 PASSED [0.0074s] [ 84%] 2025-12-04T13:28:26.5335927Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addcdiv_cuda_float32 PASSED [0.0161s] [ 84%] 2025-12-04T13:28:26.5336037Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_addr_cuda_float32 PASSED [0.0105s] [ 84%] 2025-12-04T13:28:26.5336161Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amax_cuda_float32 PASSED [0.0165s] [ 84%] 2025-12-04T13:28:26.5336268Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_amin_cuda_float32 PASSED [0.0165s] [ 84%] 2025-12-04T13:28:26.5336386Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_argwhere_cuda_float32 PASSED [0.0052s] [ 84%] 2025-12-04T13:28:26.5336514Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_atleast_2d_cuda_float32 PASSED [0.0072s] [ 84%] 2025-12-04T13:28:26.5336628Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bincount_cuda_int64 PASSED [1.4903s] [ 84%] 2025-12-04T13:28:26.5336783Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bitwise_left_shift_cuda_int64 PASSED [0.0142s] [ 84%] 2025-12-04T13:28:26.5336896Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_bool_cuda_float32 PASSED [0.0094s] [ 85%] 2025-12-04T13:28:26.5337022Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_broadcast_tensors_cuda_float32 PASSED [0.0044s] [ 85%] 2025-12-04T13:28:26.5337134Z 
test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cat_cuda_float32 PASSED [0.0107s] [ 85%] 2025-12-04T13:28:26.5337244Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cfloat_cuda_float32 PASSED [1.4992s] [ 85%] 2025-12-04T13:28:26.5337357Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_chunk_cuda_float32 PASSED [0.0081s] [ 85%] 2025-12-04T13:28:26.5337473Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_copysign_cuda_float32 PASSED [0.0164s] [ 85%] 2025-12-04T13:28:26.5337612Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cov_cuda_float32 SKIPPED [0.0012s] (Skip failing test) [ 85%] 2025-12-04T13:28:26.5337724Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_cummax_cuda_float32 PASSED [0.0047s] [ 85%] 2025-12-04T13:28:26.5337841Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagflat_cuda_float32 PASSED [0.0117s] [ 85%] 2025-12-04T13:28:26.5337969Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diagonal_scatter_cuda_float32 PASSED [0.0146s] [ 85%] 2025-12-04T13:28:26.5338077Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_diff_cuda_float32 PASSED [0.2956s] [ 85%] 2025-12-04T13:28:26.5338193Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_digamma_cuda_float32 PASSED [0.0048s] [ 85%] 2025-12-04T13:28:26.5338308Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expand_as_cuda_float32 PASSED [0.0047s] [ 85%] 2025-12-04T13:28:26.5338422Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_expm1_cuda_float32 PASSED [0.0030s] [ 85%] 2025-12-04T13:28:26.5338529Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_eye_cuda_float32 PASSED [0.0481s] [ 85%] 2025-12-04T13:28:26.5338653Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_fftshift_cuda_float32 PASSED [0.0082s] [ 85%] 2025-12-04T13:28:26.5338765Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_fft_hfft_cuda_float32 PASSED [0.0191s] [ 85%] 2025-12-04T13:28:26.5338878Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_floor_cuda_float32 PASSED [0.0031s] [ 
85%] 2025-12-04T13:28:26.5338983Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gcd_cuda_int64 PASSED [0.0117s] [ 85%] 2025-12-04T13:28:26.5339099Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_gradient_cuda_float32 PASSED [0.2331s] [ 85%] 2025-12-04T13:28:26.5339237Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_grid_sampler_3d_cuda_float32 SKIPPED [0.0002s] (Skipped!) [ 85%] 2025-12-04T13:28:26.5339350Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_half_cuda_float32 PASSED [0.0091s] [ 85%] 2025-12-04T13:28:26.5339459Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_histc_cuda_float32 PASSED [0.0646s] [ 85%] 2025-12-04T13:28:26.5339570Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_i0_cuda_float32 PASSED [1.5135s] [ 85%] 2025-12-04T13:28:26.5339682Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_igammac_cuda_float32 PASSED [0.0144s] [ 85%] 2025-12-04T13:28:26.5339816Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_index_reduce_mean_cuda_float32 PASSED [0.0100s] [ 86%] 2025-12-04T13:28:26.5340003Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_jiterator_binary_return_by_ref_cuda_float32 SKIPPED [0.0011s] (Skip failing test) [ 86%] 2025-12-04T13:28:26.5340116Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_kthvalue_cuda_float32 PASSED [0.0104s] [ 86%] 2025-12-04T13:28:26.5340226Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_le_cuda_float32 PASSED [0.0117s] [ 86%] 2025-12-04T13:28:26.5340360Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_cholesky_cuda_float32 PASSED [0.0200s] [ 86%] 2025-12-04T13:28:26.5340501Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_diagonal_cuda_float32 PASSED [0.0127s] [ 86%] 2025-12-04T13:28:26.5340632Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eig_cuda_float32 PASSED [0.0112s] [ 86%] 2025-12-04T13:28:26.5340753Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_eigh_cuda_float32 PASSED [0.0106s] [ 86%] 
2025-12-04T13:28:26.5340976Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_householder_product_cuda_float32 SKIPPED [0.0006s] (skipCUDAIfRocm: test doesn't currently work on the ROCm stack) [ 86%] 2025-12-04T13:28:26.5341112Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_ldl_factor_ex_cuda_float32 PASSED [0.0062s] [ 86%] 2025-12-04T13:28:26.5341239Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_lu_factor_ex_cuda_float32 PASSED [0.0389s] [ 86%] 2025-12-04T13:28:26.5341400Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_power_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 86%] 2025-12-04T13:28:26.5341527Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_cuda_float32 PASSED [0.1162s] [ 86%] 2025-12-04T13:28:26.5341698Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_matrix_rank_hermitian_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 86%] 2025-12-04T13:28:26.5341820Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_norm_cuda_float32 PASSED [0.1555s] [ 86%] 2025-12-04T13:28:26.5341986Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorinv_cuda_float32 PASSED [0.0118s] [ 86%] 2025-12-04T13:28:26.5342150Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_linalg_tensorsolve_cuda_float32 SKIPPED [0.0009s] (Skip failing test) [ 86%] 2025-12-04T13:28:26.5342267Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_log_softmax_cuda_float32 PASSED [0.0182s] [ 86%] 2025-12-04T13:28:26.5342390Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logcumsumexp_cuda_float32 PASSED [1.5101s] [ 86%] 2025-12-04T13:28:26.5342506Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logical_not_cuda_float32 PASSED [0.0083s] [ 86%] 2025-12-04T13:28:26.5342647Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_logspace_tensor_overload_cuda_float32 PASSED [0.8897s] [ 86%] 2025-12-04T13:28:26.5342787Z 
test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_solve_cuda_float32 SKIPPED [0.0015s] (Skip failing test) [ 86%] 2025-12-04T13:28:26.5342907Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_lu_unpack_cuda_float32 PASSED [0.0450s] [ 86%] 2025-12-04T13:28:26.5343024Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_amin_cuda_float32 PASSED [0.0920s] [ 86%] 2025-12-04T13:28:26.5343148Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_cumprod_cuda_float32 PASSED [0.0236s] [ 86%] 2025-12-04T13:28:26.5343263Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_fill_cuda_float32 PASSED [0.0123s] [ 87%] 2025-12-04T13:28:26.5343384Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_prod_cuda_float32 PASSED [0.1069s] [ 87%] 2025-12-04T13:28:26.5343508Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_softmax_cuda_float32 PASSED [0.0264s] [ 87%] 2025-12-04T13:28:26.5343624Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_std_cuda_float32 PASSED [0.0845s] [ 87%] 2025-12-04T13:28:26.5343742Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_masked_sum_cuda_float32 PASSED [0.0907s] [ 87%] 2025-12-04T13:28:26.5343855Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_binary_cuda_float32 PASSED [0.0116s] [ 87%] 2025-12-04T13:28:26.5344027Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_max_pool2d_with_indices_backward_cuda_float32 PASSED [2.6623s] [ 87%] 2025-12-04T13:28:26.5344160Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_min_reduction_with_dim_cuda_float32 PASSED [0.0059s] [ 87%] 2025-12-04T13:28:26.5344288Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mul_cuda_float32 PASSED [0.0116s] [ 87%] 2025-12-04T13:28:26.5344446Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_1_cuda_float32 SKIPPED [0.0010s] (Skip failing test) [ 87%] 2025-12-04T13:28:26.5344640Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_mvlgamma_mvlgamma_p_3_cuda_float32 SKIPPED [0.0009s] 
(Skip failing test) [ 87%] 2025-12-04T13:28:26.5344755Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_cuda_float32 PASSED [1.5229s] [ 87%] 2025-12-04T13:28:26.5344883Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_new_empty_strided_cuda_float32 PASSED [0.0112s] [ 87%] 2025-12-04T13:28:26.5345031Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_adaptive_avg_pool3d_cuda_float32 PASSED [0.0122s] [ 87%] 2025-12-04T13:28:26.5345168Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_batch_norm_cuda_float32 PASSED [0.0420s] [ 87%] 2025-12-04T13:28:26.5345310Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_channel_shuffle_cuda_float32 PASSED [0.0047s] [ 87%] 2025-12-04T13:28:26.5345441Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv1d_cuda_float32 PASSED [0.0314s] [ 87%] 2025-12-04T13:28:26.5345574Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv2d_cuda_float32 PASSED [0.0465s] [ 87%] 2025-12-04T13:28:26.5345716Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_conv_transpose3d_cuda_float32 PASSED [0.0169s] [ 87%] 2025-12-04T13:28:26.5345863Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_cosine_similarity_cuda_float32 PASSED [0.0652s] [ 87%] 2025-12-04T13:28:26.5346027Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_feature_alpha_dropout_with_train_cuda_float32 PASSED [0.0140s] [ 87%] 2025-12-04T13:28:26.5346179Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_fractional_max_pool3d_cuda_float32 PASSED [0.0376s] [ 87%] 2025-12-04T13:28:26.5346313Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hardshrink_cuda_float32 PASSED [1.4949s] [ 87%] 2025-12-04T13:28:26.5346467Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_hinge_embedding_loss_cuda_float32 PASSED [0.0664s] [ 87%] 2025-12-04T13:28:26.5346626Z 
test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_interpolate_nearest-exact_cuda_float32 PASSED [0.0187s] [ 87%] 2025-12-04T13:28:26.5346763Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_leaky_relu_cuda_float32 PASSED [1.5061s] [ 88%] 2025-12-04T13:28:26.5346908Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_local_response_norm_cuda_float32 PASSED [0.0538s] [ 88%] 2025-12-04T13:28:26.5347046Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_logsigmoid_cuda_float32 PASSED [0.0133s] [ 88%] 2025-12-04T13:28:26.5347195Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_margin_ranking_loss_cuda_float32 PASSED [0.0901s] [ 88%] 2025-12-04T13:28:26.5347332Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_max_unpool1d_cuda_float32 PASSED [0.9629s] [ 88%] 2025-12-04T13:28:26.5347463Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_mish_cuda_float32 PASSED [0.0069s] [ 88%] 2025-12-04T13:28:26.5347623Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multi_head_attention_forward_cuda_float32 PASSED [5.2130s] [ 88%] 2025-12-04T13:28:26.5347777Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_multilabel_margin_loss_cuda_float32 PASSED [0.0743s] [ 88%] 2025-12-04T13:28:26.5347908Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_normalize_cuda_float32 PASSED [0.0290s] [ 88%] 2025-12-04T13:28:26.5348073Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_pad_replicate_negative_cuda_float32 PASSED [0.0066s] [ 88%] 2025-12-04T13:28:26.5348198Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_rrelu_cuda_float32 PASSED [0.0092s] [ 88%] 2025-12-04T13:28:26.5348353Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_silu_complex_cuda_complex64 PASSED [0.0043s] [ 88%] 2025-12-04T13:28:26.5348494Z 
test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_softplus_cuda_float32 PASSED [0.0061s] [ 88%] 2025-12-04T13:28:26.5348639Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_nn_functional_unfold_cuda_float32 PASSED [1.0386s] [ 88%] 2025-12-04T13:28:26.5348754Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_norm_inf_cuda_float32 PASSED [0.0065s] [ 88%] 2025-12-04T13:28:26.5348873Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ormqr_cuda_float32 PASSED [0.1351s] [ 88%] 2025-12-04T13:28:26.5348989Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polar_cuda_float32 PASSED [0.0146s] [ 88%] 2025-12-04T13:28:26.5349122Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_polygamma_polygamma_n_2_cuda_float32 PASSED [0.0097s] [ 88%] 2025-12-04T13:28:26.5349233Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_qr_cuda_float32 PASSED [0.0403s] [ 88%] 2025-12-04T13:28:26.5349351Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randint_like_cuda_float32 PASSED [1.5225s] [ 88%] 2025-12-04T13:28:26.5349465Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_randn_cuda_float32 PASSED [0.0058s] [ 88%] 2025-12-04T13:28:26.5349574Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_ravel_cuda_float32 PASSED [0.0070s] [ 88%] 2025-12-04T13:28:26.5349689Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_renorm_cuda_float32 PASSED [0.0101s] [ 88%] 2025-12-04T13:28:26.5349799Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_repeat_cuda_float32 PASSED [0.0297s] [ 88%] 2025-12-04T13:28:26.5349932Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_round_decimals_neg_3_cuda_float32 PASSED [0.0047s] [ 88%] 2025-12-04T13:28:26.5350042Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_rsub_cuda_float32 PASSED [0.0131s] [ 89%] 2025-12-04T13:28:26.5350158Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_cuda_float32 PASSED [0.0268s] [ 89%] 2025-12-04T13:28:26.5350286Z 
test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_scatter_reduce_amin_cuda_float32 PASSED [0.0223s] [ 89%] 2025-12-04T13:28:26.5350398Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sgn_cuda_float32 PASSED [0.0031s] [ 89%] 2025-12-04T13:28:26.5350531Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_blackman_cuda_float32 PASSED [0.0253s] [ 89%] 2025-12-04T13:28:26.5350676Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_signal_windows_exponential_cuda_float32 PASSED [0.0156s] [ 89%] 2025-12-04T13:28:26.5350782Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sin_cuda_float32 PASSED [1.4944s] [ 89%] 2025-12-04T13:28:26.5350899Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_softmax_cuda_float32 PASSED [0.0094s] [ 89%] 2025-12-04T13:28:26.5351063Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sparse_sampled_addmm_cuda_float32 SKIPPED [0.0014s] (Skip failing test) [ 89%] 2025-12-04T13:28:26.5351202Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_modified_bessel_k0_cuda_float32 PASSED [0.0051s] [ 89%] 2025-12-04T13:28:26.5351362Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_polygamma_special_polygamma_n_0_cuda_float32 PASSED [0.0127s] [ 89%] 2025-12-04T13:28:26.5351486Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_special_xlog1py_cuda_float32 PASSED [0.0175s] [ 89%] 2025-12-04T13:28:26.5351599Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sqrt_cuda_float32 PASSED [1.5046s] [ 89%] 2025-12-04T13:28:26.5351717Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_std_unbiased_cuda_float32 PASSED [0.0061s] [ 89%] 2025-12-04T13:28:26.5351887Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_sum_to_size_cuda_float32 PASSED [1.5371s] [ 89%] 2025-12-04T13:28:26.5351995Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_svd_cuda_float32 PASSED [0.3240s] [ 89%] 2025-12-04T13:28:26.5352109Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_t_copy_cuda_float32 PASSED [0.0045s] [ 89%] 
2025-12-04T13:28:26.5352232Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_take_cuda_float32 PASSED [0.0099s] [ 89%] 2025-12-04T13:28:26.5352406Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_cuda_float8_e4m3fn SKIPPED [0.0006s] (Requires CUDA SM >= 8.9) [ 89%] 2025-12-04T13:28:26.5352609Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_torch__scaled_mm_v2_cuda_float8_e4m3fn SKIPPED [0.0007s] (Requires CUDA SM >= 8.9) [ 89%] 2025-12-04T13:28:26.5352733Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unbind_copy_cuda_float32 PASSED [0.0065s] [ 89%] 2025-12-04T13:28:26.5352848Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unflatten_cuda_float32 PASSED [0.0126s] [ 89%] 2025-12-04T13:28:26.5352969Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_copy_cuda_float32 PASSED [0.0171s] [ 89%] 2025-12-04T13:28:26.5353084Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_unfold_cuda_float32 PASSED [0.0174s] [ 89%] 2025-12-04T13:28:26.5353191Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_var_cuda_float32 PASSED [0.0129s] [ 89%] 2025-12-04T13:28:26.5353315Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_view_as_real_cuda_complex64 PASSED [0.0034s] [ 90%] 2025-12-04T13:28:26.5353427Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_vsplit_cuda_float32 PASSED [0.0055s] [ 90%] 2025-12-04T13:28:26.5353540Z test_ops.py::TestFakeTensorCUDA::test_pointwise_ops_xlogy_cuda_float32 PASSED [0.0166s] [ 90%] 2025-12-04T13:28:26.5353659Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_bfloat16 PASSED [0.0069s] [ 90%] 2025-12-04T13:28:26.5353779Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_arange_cuda_int32 PASSED [0.0060s] [ 90%] 2025-12-04T13:28:26.5353897Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_cuda_int8 PASSED [0.0115s] [ 90%] 2025-12-04T13:28:26.5354046Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_linspace_tensor_overload_cuda_float32 
PASSED [0.0391s] [ 90%] 2025-12-04T13:28:26.5354165Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_cuda_int8 PASSED [0.0236s] [ 90%] 2025-12-04T13:28:26.5354312Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_logspace_tensor_overload_cuda_float32 PASSED [0.2417s] [ 90%] 2025-12-04T13:28:26.5354430Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_bfloat16 PASSED [1.5078s] [ 90%] 2025-12-04T13:28:26.5354548Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_float64 PASSED [0.0034s] [ 90%] 2025-12-04T13:28:26.5354661Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_ones_cuda_int64 PASSED [1.4989s] [ 90%] 2025-12-04T13:28:26.5354783Z test_ops.py::TestFakeTensorCUDA::test_strided_layout__refs_zeros_cuda_int16 PASSED [0.0036s] [ 90%] 2025-12-04T13:28:26.5354897Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int16 PASSED [0.0059s] [ 90%] 2025-12-04T13:28:26.5355007Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int32 PASSED [0.0057s] [ 90%] 2025-12-04T13:28:26.5355121Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_arange_cuda_int8 PASSED [0.0055s] [ 90%] 2025-12-04T13:28:26.5355232Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_complex32 PASSED [1.5232s] [ 90%] 2025-12-04T13:28:26.5355347Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_float64 PASSED [0.0038s] [ 90%] 2025-12-04T13:28:26.5355453Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_full_cuda_int32 PASSED [1.5010s] [ 90%] 2025-12-04T13:28:26.5355577Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_bfloat16 PASSED [0.0109s] [ 90%] 2025-12-04T13:28:26.5355711Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex128 PASSED [0.0099s] [ 90%] 2025-12-04T13:28:26.5355832Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_complex64 PASSED [0.0098s] [ 90%] 2025-12-04T13:28:26.5355947Z 
test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_float64 PASSED [0.0097s] [ 90%] 2025-12-04T13:28:26.5356073Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_cuda_uint8 PASSED [0.0067s] [ 90%] 2025-12-04T13:28:26.5356211Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_bfloat16 PASSED [0.0368s] [ 90%] 2025-12-04T13:28:26.5356383Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_complex128 PASSED [0.0348s] [ 91%] 2025-12-04T13:28:26.5356521Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_float64 PASSED [0.0348s] [ 91%] 2025-12-04T13:28:26.5356658Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_linspace_tensor_overload_cuda_uint8 PASSED [0.0222s] [ 91%] 2025-12-04T13:28:26.5356804Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_complex128 PASSED [0.2215s] [ 91%] 2025-12-04T13:28:26.5356934Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_logspace_tensor_overload_cuda_int32 PASSED [0.2027s] [ 91%] 2025-12-04T13:28:26.5357051Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_bfloat16 PASSED [1.5130s] [ 91%] 2025-12-04T13:28:26.5357158Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int32 PASSED [0.0035s] [ 91%] 2025-12-04T13:28:26.5357269Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_ones_cuda_int8 PASSED [1.5069s] [ 91%] 2025-12-04T13:28:26.5357381Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_bfloat16 PASSED [0.0036s] [ 91%] 2025-12-04T13:28:26.5357494Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int16 PASSED [1.5089s] [ 91%] 2025-12-04T13:28:26.5357602Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_int32 PASSED [0.0035s] [ 91%] 2025-12-04T13:28:26.5357717Z test_ops.py::TestFakeTensorCUDA::test_strided_layout_zeros_cuda_uint8 PASSED [1.5157s] [ 91%] 2025-12-04T13:28:26.5357825Z 
test_ops.py::TestTagsCUDA::test_tags_H_cuda_float32 SKIPPED [0.0017s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5357943Z test_ops.py::TestTagsCUDA::test_tags___radd___cuda_float32 SKIPPED [0.0012s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358062Z test_ops.py::TestTagsCUDA::test_tags___rmatmul___cuda_float32 SKIPPED [0.0013s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358176Z test_ops.py::TestTagsCUDA::test_tags___rpow___cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358292Z test_ops.py::TestTagsCUDA::test_tags__chunk_cat_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358431Z test_ops.py::TestTagsCUDA::test_tags__native_batch_norm_legit_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358540Z test_ops.py::TestTagsCUDA::test_tags__refs_T_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358679Z test_ops.py::TestTagsCUDA::test_tags__refs__conversions_byte_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358821Z test_ops.py::TestTagsCUDA::test_tags__refs__conversions_cdouble_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5358957Z test_ops.py::TestTagsCUDA::test_tags__refs__conversions_chalf_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5359094Z test_ops.py::TestTagsCUDA::test_tags__refs__conversions_float_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5359209Z test_ops.py::TestTagsCUDA::test_tags__refs_abs_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5359336Z test_ops.py::TestTagsCUDA::test_tags__refs_atleast_1d_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 91%] 2025-12-04T13:28:26.5359458Z test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_not_cuda_int64 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5359597Z 
test_ops.py::TestTagsCUDA::test_tags__refs_bitwise_xor_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5359719Z test_ops.py::TestTagsCUDA::test_tags__refs_bucketize_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5359866Z test_ops.py::TestTagsCUDA::test_tags__refs_count_nonzero_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5359984Z test_ops.py::TestTagsCUDA::test_tags__refs_equal_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360132Z test_ops.py::TestTagsCUDA::test_tags__refs_fft_hfft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360252Z test_ops.py::TestTagsCUDA::test_tags__refs_fft_ifftn_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360375Z test_ops.py::TestTagsCUDA::test_tags__refs_fft_ihfft_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360501Z test_ops.py::TestTagsCUDA::test_tags__refs_fft_irfftn_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360619Z test_ops.py::TestTagsCUDA::test_tags__refs_flatten_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360740Z test_ops.py::TestTagsCUDA::test_tags__refs_flipud_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360852Z test_ops.py::TestTagsCUDA::test_tags__refs_gcd_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5360977Z test_ops.py::TestTagsCUDA::test_tags__refs_geometric_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361094Z test_ops.py::TestTagsCUDA::test_tags__refs_hypot_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361207Z test_ops.py::TestTagsCUDA::test_tags__refs_i0_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361324Z test_ops.py::TestTagsCUDA::test_tags__refs_imag_cuda_complex64 SKIPPED 
[0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361450Z test_ops.py::TestTagsCUDA::test_tags__refs_index_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361568Z test_ops.py::TestTagsCUDA::test_tags__refs_isclose_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361697Z test_ops.py::TestTagsCUDA::test_tags__refs_isposinf_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361807Z test_ops.py::TestTagsCUDA::test_tags__refs_lcm_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5361973Z test_ops.py::TestTagsCUDA::test_tags__refs_lgamma_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5362103Z test_ops.py::TestTagsCUDA::test_tags__refs_linalg_cross_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5362226Z test_ops.py::TestTagsCUDA::test_tags__refs_linalg_norm_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5362345Z test_ops.py::TestTagsCUDA::test_tags__refs_log10_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5362468Z test_ops.py::TestTagsCUDA::test_tags__refs_logical_and_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 92%] 2025-12-04T13:28:26.5362592Z test_ops.py::TestTagsCUDA::test_tags__refs_logspace_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5362715Z test_ops.py::TestTagsCUDA::test_tags__refs_logsumexp_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5362862Z test_ops.py::TestTagsCUDA::test_tags__refs_meshgrid_variadic_tensors_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5362981Z test_ops.py::TestTagsCUDA::test_tags__refs_minimum_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363102Z test_ops.py::TestTagsCUDA::test_tags__refs_movedim_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 
2025-12-04T13:28:26.5363241Z test_ops.py::TestTagsCUDA::test_tags__refs_nan_to_num_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363364Z test_ops.py::TestTagsCUDA::test_tags__refs_narrow_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363497Z test_ops.py::TestTagsCUDA::test_tags__refs_new_empty_strided_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363658Z test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_leaky_relu_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363808Z test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_prelu_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5363983Z test_ops.py::TestTagsCUDA::test_tags__refs_nn_functional_triplet_margin_loss_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364118Z test_ops.py::TestTagsCUDA::test_tags__refs_normal__in_place_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364234Z test_ops.py::TestTagsCUDA::test_tags__refs_ones_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364357Z test_ops.py::TestTagsCUDA::test_tags__refs_pow_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364470Z test_ops.py::TestTagsCUDA::test_tags__refs_sin_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364590Z test_ops.py::TestTagsCUDA::test_tags__refs_sinh_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364715Z test_ops.py::TestTagsCUDA::test_tags__refs_special_i0e_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5364878Z test_ops.py::TestTagsCUDA::test_tags__refs_special_multigammaln_mvlgamma_p_5_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365010Z test_ops.py::TestTagsCUDA::test_tags__refs_split_with_sizes_cuda_float32 SKIPPED [0.0011s] (Only 
runs on cpu) [ 93%] 2025-12-04T13:28:26.5365128Z test_ops.py::TestTagsCUDA::test_tags__refs_sqrt_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365248Z test_ops.py::TestTagsCUDA::test_tags__refs_square_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365371Z test_ops.py::TestTagsCUDA::test_tags__refs_std_mean_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365485Z test_ops.py::TestTagsCUDA::test_tags__refs_stft_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365611Z test_ops.py::TestTagsCUDA::test_tags__refs_sum_to_size_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365746Z test_ops.py::TestTagsCUDA::test_tags__refs_take_along_dim_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 93%] 2025-12-04T13:28:26.5365856Z test_ops.py::TestTagsCUDA::test_tags__refs_to_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5365978Z test_ops.py::TestTagsCUDA::test_tags__refs_trace_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366104Z test_ops.py::TestTagsCUDA::test_tags__refs_transpose_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366231Z test_ops.py::TestTagsCUDA::test_tags__refs_unbind_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366354Z test_ops.py::TestTagsCUDA::test_tags__refs_unfold_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366487Z test_ops.py::TestTagsCUDA::test_tags__refs_unsqueeze_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366605Z test_ops.py::TestTagsCUDA::test_tags__refs_vstack_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366745Z test_ops.py::TestTagsCUDA::test_tags__segment_reduce_offsets_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366860Z 
test_ops.py::TestTagsCUDA::test_tags_addcmul_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5366993Z test_ops.py::TestTagsCUDA::test_tags_alias_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367103Z test_ops.py::TestTagsCUDA::test_tags_amax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367220Z test_ops.py::TestTagsCUDA::test_tags_argmin_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367378Z test_ops.py::TestTagsCUDA::test_tags_as_strided_partial_views_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367500Z test_ops.py::TestTagsCUDA::test_tags_atanh_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367634Z test_ops.py::TestTagsCUDA::test_tags_bitwise_and_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367749Z test_ops.py::TestTagsCUDA::test_tags_bitwise_xor_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367864Z test_ops.py::TestTagsCUDA::test_tags_bool_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5367991Z test_ops.py::TestTagsCUDA::test_tags_broadcast_shapes_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368123Z test_ops.py::TestTagsCUDA::test_tags_broadcast_tensors_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368236Z test_ops.py::TestTagsCUDA::test_tags_cauchy_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368353Z test_ops.py::TestTagsCUDA::test_tags_chalf_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368469Z test_ops.py::TestTagsCUDA::test_tags_clamp_max_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368590Z test_ops.py::TestTagsCUDA::test_tags_clamp_min_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368700Z 
test_ops.py::TestTagsCUDA::test_tags_clone_cuda_float32 SKIPPED [0.0012s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368827Z test_ops.py::TestTagsCUDA::test_tags_constant_pad_nd_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 94%] 2025-12-04T13:28:26.5368947Z test_ops.py::TestTagsCUDA::test_tags_contiguous_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369061Z test_ops.py::TestTagsCUDA::test_tags_cov_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369175Z test_ops.py::TestTagsCUDA::test_tags_cummin_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369302Z test_ops.py::TestTagsCUDA::test_tags_diagonal_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369421Z test_ops.py::TestTagsCUDA::test_tags_digamma_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369533Z test_ops.py::TestTagsCUDA::test_tags_einsum_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369662Z test_ops.py::TestTagsCUDA::test_tags_empty_permuted_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369774Z test_ops.py::TestTagsCUDA::test_tags_expand_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5369889Z test_ops.py::TestTagsCUDA::test_tags_eye_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370001Z test_ops.py::TestTagsCUDA::test_tags_fft_fft2_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370127Z test_ops.py::TestTagsCUDA::test_tags_fft_fftshift_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370240Z test_ops.py::TestTagsCUDA::test_tags_fft_hfft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370359Z test_ops.py::TestTagsCUDA::test_tags_fft_ifft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370473Z 
test_ops.py::TestTagsCUDA::test_tags_fft_ifftn_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370591Z test_ops.py::TestTagsCUDA::test_tags_fft_ihfft_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370720Z test_ops.py::TestTagsCUDA::test_tags_fft_ihfftn_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370842Z test_ops.py::TestTagsCUDA::test_tags_fft_irfftn_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5370951Z test_ops.py::TestTagsCUDA::test_tags_fill_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5371085Z test_ops.py::TestTagsCUDA::test_tags_float_power_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5371211Z test_ops.py::TestTagsCUDA::test_tags_fmax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5371652Z test_ops.py::TestTagsCUDA::test_tags_fmod_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5371767Z test_ops.py::TestTagsCUDA::test_tags_frac_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5371960Z test_ops.py::TestTagsCUDA::test_tags_grid_sampler_3d_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 95%] 2025-12-04T13:28:26.5372081Z test_ops.py::TestTagsCUDA::test_tags_heaviside_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5372188Z test_ops.py::TestTagsCUDA::test_tags_i0_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 95%] 2025-12-04T13:28:26.5372305Z test_ops.py::TestTagsCUDA::test_tags_igamma_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5372418Z test_ops.py::TestTagsCUDA::test_tags_igammac_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5372533Z test_ops.py::TestTagsCUDA::test_tags_imag_cuda_complex64 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5372651Z test_ops.py::TestTagsCUDA::test_tags_index_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5372782Z test_ops.py::TestTagsCUDA::test_tags_index_reduce_amin_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5372891Z test_ops.py::TestTagsCUDA::test_tags_int_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373005Z test_ops.py::TestTagsCUDA::test_tags_isin_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373116Z test_ops.py::TestTagsCUDA::test_tags_isinf_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373233Z test_ops.py::TestTagsCUDA::test_tags_istft_cuda_complex64 SKIPPED [0.0011s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373387Z test_ops.py::TestTagsCUDA::test_tags_jiterator_4inputs_with_extra_args_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373512Z test_ops.py::TestTagsCUDA::test_tags_jiterator_unary_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373625Z test_ops.py::TestTagsCUDA::test_tags_lcm_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5373753Z test_ops.py::TestTagsCUDA::test_tags_linalg_cholesky_ex_cuda_float32 SKIPPED [0.0009s] (Only runs on 
cpu) [ 96%] 2025-12-04T13:28:26.5373882Z test_ops.py::TestTagsCUDA::test_tags_linalg_eigvalsh_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374030Z test_ops.py::TestTagsCUDA::test_tags_linalg_norm_subgradients_at_zero_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374149Z test_ops.py::TestTagsCUDA::test_tags_linalg_qr_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374272Z test_ops.py::TestTagsCUDA::test_tags_linalg_svdvals_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374399Z test_ops.py::TestTagsCUDA::test_tags_linalg_vander_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374520Z test_ops.py::TestTagsCUDA::test_tags_linalg_vecdot_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374651Z test_ops.py::TestTagsCUDA::test_tags_linalg_vector_norm_cuda_float32 SKIPPED [0.0012s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374763Z test_ops.py::TestTagsCUDA::test_tags_log10_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5374901Z test_ops.py::TestTagsCUDA::test_tags_log1p_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5375024Z test_ops.py::TestTagsCUDA::test_tags_log_softmax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5375157Z test_ops.py::TestTagsCUDA::test_tags_logdet_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5375279Z test_ops.py::TestTagsCUDA::test_tags_logsumexp_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 96%] 2025-12-04T13:28:26.5375416Z test_ops.py::TestTagsCUDA::test_tags_masked_argmax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5375562Z test_ops.py::TestTagsCUDA::test_tags_masked_cumprod_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5375681Z 
test_ops.py::TestTagsCUDA::test_tags_masked_cumsum_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5375814Z test_ops.py::TestTagsCUDA::test_tags_masked_log_softmax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5375939Z test_ops.py::TestTagsCUDA::test_tags_masked_normalize_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376061Z test_ops.py::TestTagsCUDA::test_tags_masked_prod_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376175Z test_ops.py::TestTagsCUDA::test_tags_maximum_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376309Z test_ops.py::TestTagsCUDA::test_tags_min_reduction_no_dim_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376417Z test_ops.py::TestTagsCUDA::test_tags_mm_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376533Z test_ops.py::TestTagsCUDA::test_tags_nansum_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376645Z test_ops.py::TestTagsCUDA::test_tags_narrow_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376776Z test_ops.py::TestTagsCUDA::test_tags_native_batch_norm_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5376891Z test_ops.py::TestTagsCUDA::test_tags_new_empty_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377033Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_batch_norm_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377189Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_binary_cross_entropy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377324Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_ctc_loss_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377478Z 
test_ops.py::TestTagsCUDA::test_tags_nn_functional_hinge_embedding_loss_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377628Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_bicubic_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377781Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_linear_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5377930Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_interpolate_nearest_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378070Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_leaky_relu_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378206Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_logsigmoid_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378359Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_margin_ranking_loss_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378503Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_max_unpool1d_grad_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378640Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_mse_loss_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 97%] 2025-12-04T13:28:26.5378815Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_multi_head_attention_forward_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5378956Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_pad_replicate_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379101Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_relu6_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379275Z test_ops.py::TestTagsCUDA::test_tags_nn_functional_scaled_dot_product_attention_cuda_float32 SKIPPED [0.0010s] (Only runs on cpu) [ 98%] 
2025-12-04T13:28:26.5379407Z test_ops.py::TestTagsCUDA::test_tags_norm_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379543Z test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_0_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379681Z test_ops.py::TestTagsCUDA::test_tags_polygamma_polygamma_n_3_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379799Z test_ops.py::TestTagsCUDA::test_tags_positive_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5379911Z test_ops.py::TestTagsCUDA::test_tags_qr_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380026Z test_ops.py::TestTagsCUDA::test_tags_rad2deg_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380150Z test_ops.py::TestTagsCUDA::test_tags_randint_like_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380269Z test_ops.py::TestTagsCUDA::test_tags_randn_like_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380386Z test_ops.py::TestTagsCUDA::test_tags_ravel_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380510Z test_ops.py::TestTagsCUDA::test_tags_resolve_conj_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380623Z test_ops.py::TestTagsCUDA::test_tags_rot90_cuda_float32 SKIPPED [0.0012s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380754Z test_ops.py::TestTagsCUDA::test_tags_round_decimals_3_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380863Z test_ops.py::TestTagsCUDA::test_tags_rsub_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5380998Z test_ops.py::TestTagsCUDA::test_tags_scatter_reduce_mean_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381110Z test_ops.py::TestTagsCUDA::test_tags_select_cuda_float32 SKIPPED [0.0010s] (Only 
runs on cpu) [ 98%] 2025-12-04T13:28:26.5381224Z test_ops.py::TestTagsCUDA::test_tags_sign_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381363Z test_ops.py::TestTagsCUDA::test_tags_signal_windows_gaussian_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381496Z test_ops.py::TestTagsCUDA::test_tags_signal_windows_hann_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381606Z test_ops.py::TestTagsCUDA::test_tags_sin_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381733Z test_ops.py::TestTagsCUDA::test_tags_slice_scatter_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5381846Z test_ops.py::TestTagsCUDA::test_tags_softmax_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 98%] 2025-12-04T13:28:26.5382006Z test_ops.py::TestTagsCUDA::test_tags_sparse_mm_reduce_cuda_float32 SKIPPED [0.0001s] (Skipped!) [ 99%] 2025-12-04T13:28:26.5382130Z test_ops.py::TestTagsCUDA::test_tags_sparse_sampled_addmm_cuda_float32 SKIPPED [0.0001s] (Skipped!) 
[ 99%] 2025-12-04T13:28:26.5382254Z test_ops.py::TestTagsCUDA::test_tags_special_i0e_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5382378Z test_ops.py::TestTagsCUDA::test_tags_special_i1e_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5382540Z test_ops.py::TestTagsCUDA::test_tags_special_modified_bessel_k1_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5382666Z test_ops.py::TestTagsCUDA::test_tags_special_ndtri_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5382785Z test_ops.py::TestTagsCUDA::test_tags_special_zeta_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5382925Z test_ops.py::TestTagsCUDA::test_tags_split_list_args_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383058Z test_ops.py::TestTagsCUDA::test_tags_squeeze_copy_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383197Z test_ops.py::TestTagsCUDA::test_tags_squeeze_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383308Z test_ops.py::TestTagsCUDA::test_tags_stack_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383427Z test_ops.py::TestTagsCUDA::test_tags_std_mean_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383536Z test_ops.py::TestTagsCUDA::test_tags_svd_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383661Z test_ops.py::TestTagsCUDA::test_tags_svd_lowrank_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383784Z test_ops.py::TestTagsCUDA::test_tags_tensor_split_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5383898Z test_ops.py::TestTagsCUDA::test_tags_topk_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384017Z test_ops.py::TestTagsCUDA::test_tags_trapezoid_cuda_float32 SKIPPED [0.0009s] 
(Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384135Z test_ops.py::TestTagsCUDA::test_tags_unfold_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384252Z test_ops.py::TestTagsCUDA::test_tags_uniform_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384373Z test_ops.py::TestTagsCUDA::test_tags_unravel_index_cuda_int64 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384492Z test_ops.py::TestTagsCUDA::test_tags_var_mean_cuda_float32 SKIPPED [0.0011s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384605Z test_ops.py::TestTagsCUDA::test_tags_where_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384724Z test_ops.py::TestTagsCUDA::test_tags_zeros_cuda_float32 SKIPPED [0.0009s] (Only runs on cpu) [ 99%] 2025-12-04T13:28:26.5384896Z test_ops.py::TestForwardADWithScalarsCUDA::test_0d_tensor_with_python_scalar_div_no_rounding_mode_cuda_float32 PASSED [0.0020s] [ 99%] 2025-12-04T13:28:26.5385070Z test_ops.py::TestForwardADWithScalarsCUDA::test_0d_tensor_with_python_scalar_div_trunc_rounding_cuda_float32 PASSED [0.0014s] [100%] 2025-12-04T13:28:26.5385074Z 2025-12-04T13:28:26.5385258Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/test_ops/test_ops-f555603e316361f2.xml - 2025-12-04T13:28:26.5385358Z == 2128 passed, 337 skipped, 4208 deselected, 29 xfailed in 816.17s (0:13:36) == 2025-12-04T13:28:26.5385588Z The following tests failed and then succeeded when run in a new process['test/test_ops.py::TestCommonCUDA::test_python_ref_torch_fallback__refs_linalg_diagonal_cuda_complex32'] 2025-12-04T13:28:26.5385595Z 2025-12-04T13:28:26.5385718Z FINISHED PRINTING LOG FILE of test_ops 3/5 (test/test-reports/test_ops_3.5_ea3c4bc91b7c0df0_.log) 2025-12-04T13:28:26.5385722Z 2025-12-04T13:28:26.5385820Z Finished test_ops 3/5 ... 
[2025-12-04 13:28:26.283781][3581414.808591433], took 44.16min 2025-12-04T13:28:26.5386064Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:28:26.5386160Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:28:26.5386258Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:28:26.5386313Z Uploading artifacts took 0.00 seconds 2025-12-04T13:28:26.5386417Z Running test_jit_llga_fuser 1/1 ... [2025-12-04 13:28:26.291413][3581414.816226229] 2025-12-04T13:28:26.5386472Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:28:26.5386787Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_jit_llga_fuser.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:28:26.291627] 2025-12-04T13:28:52.2494393Z 2025-12-04T13:28:52.2495155Z test_jit_llga_fuser 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_jit_llga_fuser_1.1_a92e6e46e811279d_.log 2025-12-04T13:28:52.2510221Z Running 107 items in this shard: test/test_jit_llga_fuser.py::TestEnableDisableLlgaFuser::test_context_manager, test/test_jit_llga_fuser.py::TestDynamoAOT::test_dynamo_aot_ts_onednn, test/test_jit_llga_fuser.py::TestModel::test_vision_alexnet_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_alexnet_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet121_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet121_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet161_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet161_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet169_bfloat16, 
test/test_jit_llga_fuser.py::TestModel::test_vision_densenet169_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet201_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_densenet201_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b1_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b1_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b2_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b3_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b3_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b4_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b4_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b5_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b5_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b6_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b6_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b7_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_efficientnet_b7_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_googlenet_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_googlenet_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mnasnet1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_mnasnet1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v2_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v3_large_bfloat16, 
test/test_jit_llga_fuser.py::TestModel::test_vision_mobilenet_v3_large_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_regnet_y_400mf_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_regnet_y_400mf_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnet50_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnet50_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext101_32x8d_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext101_32x8d_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext50_32x4d_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_resnext50_32x4d_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_shufflenet_v2_x1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_shufflenet_v2_x1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_squeezenet1_0_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_squeezenet1_0_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_vgg16_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_vgg16_float32, test/test_jit_llga_fuser.py::TestModel::test_vision_wide_resnet50_2_bfloat16, test/test_jit_llga_fuser.py::TestModel::test_vision_wide_resnet50_2_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_bn2d_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_bn2d_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_relu_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_bn_relu_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_clamp_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_clamp_cuda_float32, 
test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_silu_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_silu_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_sum_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_conv2d_sum_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_ensure_tensor_is_rewrapped_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_ensure_tensor_is_rewrapped_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_linear_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_linear_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_rewrap_tensor_input_to_pytorch_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_rewrap_tensor_input_to_pytorch_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_cuda_bfloat16, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_cuda_float32, test/test_jit_llga_fuser.py::TestFusionPatternCUDA::test_wildcard_unsupported_dtype_cuda_int32, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_scalar_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_add_scalar_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_addmm_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_addmm_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_avg_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_avg_pool2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_bn2d_cuda_bfloat16, 
test/test_jit_llga_fuser.py::TestOpCUDA::test_bn2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_cat_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_cat_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_conv2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_conv2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_eltwise_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_eltwise_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_identity_binary_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_identity_binary_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_layer_norm_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_layer_norm_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_linear_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_linear_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_max_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_max_pool2d_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_mul_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_mul_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_softmax_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_softmax_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_typecheck_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_typecheck_cuda_float32, test/test_jit_llga_fuser.py::TestOpCUDA::test_variable_kernel_avg_pool2d_cuda_bfloat16, test/test_jit_llga_fuser.py::TestOpCUDA::test_variable_kernel_avg_pool2d_cuda_float32 2025-12-04T13:28:52.2521412Z 2025-12-04T13:28:52.2521527Z Finished test_jit_llga_fuser 1/1 ... 
[2025-12-04 13:28:52.249131][3581440.773939537], took 0.43min 2025-12-04T13:28:52.2521974Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:28:52.2569742Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:28:52.2572708Z Running test_sparse_csr 2/2 ... [2025-12-04 13:28:52.257047][3581440.781860687] 2025-12-04T13:28:52.2573025Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:28:52.2574795Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_sparse_csr.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:28:52.257275] 2025-12-04T13:39:42.1872946Z 2025-12-04T13:39:42.1873754Z test_sparse_csr 2/2 was successful, full logs can be found in artifacts with path test/test-reports/test_sparse_csr_2.2_2b9c1a10cfbae0b8_.log 2025-12-04T13:39:42.2206003Z Running 2417 items in this shard: test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_add_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_all_sparse_csr_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_dense_result_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_errors_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_0_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_10_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_0_n_1_m_25_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_10_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_0_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_1_n_1_m_1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_0_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_0_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_0_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_10_m_25_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_0_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmm_sizes_all_sparse_csr_k_8_n_1_m_1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_11x9_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_3x3_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_addmv_shape_5x7_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_isneginf_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_autograd_sparse_csr_unary_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_baddbmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int32_noncontiguous_True_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_2_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int32_noncontiguous_True_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_False_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmm_block_size_3_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int32_noncontiguous_True_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_2_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int32_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_addmv_block_size_3_int64_noncontiguous_True_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_False_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_2_int64_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_False_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int32_noncontiguous_True_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_False_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_block_triangular_solve_block_size_3_int64_noncontiguous_True_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_bmm_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSC_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseBSR_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSC_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseCSC_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_compressed_layout_conversions_coverage_SparseCSR_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_csr_conversion_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_coo_to_csr_convert_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_coo_conversion_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_matvec_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_storage_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_stride_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_2_cuda_float64_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_csr_to_block_csr_blocksize_4_cuda_float64_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSC_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_Batched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_NonBatched_Hybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseBSR_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_dense_to_from_sparse_compressed_SparseCSC_NonBatched_NonHybrid_cuda, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_direct_coo_csr_conversion_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_exercise_detach_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_matmul_device_mismatch_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mm_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_mul_scalar_enable_hybrid_False_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSR_cuda_bool, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_as_sparse_compressed_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_resize_errors_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_autograd_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_errors_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sampled_addmm_zero_sized_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int32_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseBSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_select_SparseCSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_add_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csc_to_dense_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_from_dense_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_to_dense_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_angle_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_nn_functional_relu_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_inplace_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_abs_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_int8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_round_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_bool, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_csr_unary_out_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_mm_reduce_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_to_sparse_compressed_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sparse_triangular_solve_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_float32, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSC_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_transpose_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_asinh_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_float64, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_erfinv_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_isposinf_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_round_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float16, 
test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_float64, test/test_sparse_csr.py::TestSparseCSRCUDA::test_zero_to_zero_correspondence_unary_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_clone_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_abs_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_deg2rad_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_erfinv_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isneginf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_mul_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_nn_functional_relu_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sgn_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_sum_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_tanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_to_sparse_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSC_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_angle_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isnan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_mean_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sign_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_tanh_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_trunc_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseBSR_zeros_like_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_abs_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_angle_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asin_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_ceil_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_conj_physical_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_frac_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_isposinf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_log1p_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_amin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_mean_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_prod_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_mul_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_nn_functional_relu_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_randn_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sgn_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sign_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sin_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sinh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sqrt_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_complex32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_tanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_to_sparse_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSC_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_abs_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_angle_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_asinh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_atanh_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_ceil_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_conj_physical_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_deg2rad_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_erfinv_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_expm1_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_floor_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_frac_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isnan_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isneginf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_isposinf_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_log1p_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amax_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_amin_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_mean_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_prod_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_masked_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_mul_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_neg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_nn_functional_relu_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_positive_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_rad2deg_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_randn_like_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_round_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sgn_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sign_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_signbit_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_complex32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sin_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sinh_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sqrt_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_sum_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tan_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_tanh_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_to_sparse_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_trunc_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_consistency_SparseCSR_zeros_like_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_SparseCSR_cuda_uint8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_copy_errors_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseBSC_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseCSC_cuda, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_dim_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_errors_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSC_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSC_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseBSR_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSC_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_empty_like_SparseCSR_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSC_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSR_target_sparse_compressed_tensor_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_invalid_input_SparseBSR_target_sparse_compressed_tensor_no_size_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_layout_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_pickle_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseBSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_print_SparseCSR_cuda, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseBSR_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int32_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSC_int64_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int32_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_select_copy_SparseCSR_int64_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_complex128, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_list_SparseCSR_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_____from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_bool, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor___factory_from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_bfloat16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference___from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_complex64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_list_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int32, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_constructor_shape_and_device_inference_factory_from_tensor_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSC_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_float64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_sparse_compressed_tensor_with_dims_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSC_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_complex128, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_to_dtype_SparseCSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSC_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_bool, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseBSR_cuda_uint8, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_float64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSC_cuda_int64, 
test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_complex64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int16, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int32, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int64, test/test_sparse_csr.py::TestSparseCompressedCUDA::test_validate_SparseCSR_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_16_int64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_32_int64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_dense_bmm_block_size_64_int64_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_16_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_2x3_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_scatter_mm_blocksize_32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_bsr_softmax_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_16x32_out_dtype_unspecified_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op__int_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_int32_cuda_int8, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_addmm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_16x32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_linear_blocksize_32_out_dtype_unspecified_cuda_float16, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_16x32_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_kernel_op_bsr_dense_mm_blocksize_32_out_dtype_unspecified_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_sampled_addmm_block_size_64_cuda_float32, 
test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_16_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_64_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scaled_dot_product_attention_block_size_64_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_scatter_mm_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_float32, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op__int_bsr_dense_addmm_out_dtype_unspecified_cuda_float16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_int32_cuda_int8, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_bfloat16, test/test_sparse_csr.py::TestSparseCompressedTritonKernelsCUDA::test_triton_tune_op_bsr_dense_addmm_out_dtype_unspecified_cuda_int8 
2025-12-04T13:39:42.2516870Z
2025-12-04T13:39:42.2516994Z Finished test_sparse_csr 2/2 ... [2025-12-04 13:39:42.189296][3582090.714105984], took 10.83min
2025-12-04T13:39:42.2517437Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T13:39:42.2517789Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:39:42.2517998Z Running optim/test_optim 1/1 ... [2025-12-04 13:39:42.196829][3582090.721643021]
2025-12-04T13:39:42.2518170Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:39:42.2518544Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'optim/test_optim.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:39:42.197049]
2025-12-04T13:39:43.9554140Z
2025-12-04T13:39:43.9555112Z optim/test_optim 1/1 was successful, full logs can be found in artifacts with path test/test-reports/optim.test_optim_1.1_7f561f31974ac048_.log
2025-12-04T13:39:43.9555539Z
2025-12-04T13:39:43.9555743Z Finished optim/test_optim 1/1 ... [2025-12-04 13:39:43.955105][3582092.479914526], took 0.03min
2025-12-04T13:39:43.9573570Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T13:39:43.9633640Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:39:43.9634653Z Running torch_np/numpy_tests/core/test_getlimits 1/1 ... [2025-12-04 13:39:43.963292][3582092.488105551]
2025-12-04T13:39:43.9634994Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:39:43.9637211Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_getlimits.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:39:43.963514]
2025-12-04T13:39:46.2318839Z
2025-12-04T13:39:46.2319872Z torch_np/numpy_tests/core/test_getlimits 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_getlimits_1.1_77167229c1845c3f_.log
2025-12-04T13:39:46.2324142Z Running 17 items in this shard: test/torch_np/numpy_tests/core/test_getlimits.py::TestPythonFloat::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestHalf::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestSingle::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestDouble::test_singleton, test/torch_np/numpy_tests/core/test_getlimits.py::TestFinfo::test_basic, test/torch_np/numpy_tests/core/test_getlimits.py::TestFinfo::test_basic_missing, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_basic, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T0, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T1, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T2, test/torch_np/numpy_tests/core/test_getlimits.py::TestIinfo::test_unsigned_max_T3, test/torch_np/numpy_tests/core/test_getlimits.py::TestRepr::test_finfo_repr, test/torch_np/numpy_tests/core/test_getlimits.py::TestRepr::test_iinfo_repr, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_instances, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_known_types, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_plausible_finfo, test/torch_np/numpy_tests/core/test_getlimits.py::TestMisc::test_subnormal_warning
2025-12-04T13:39:46.2328229Z
2025-12-04T13:39:46.2328497Z Finished torch_np/numpy_tests/core/test_getlimits 1/1 ... [2025-12-04 13:39:46.231575][3582094.75638403], took 0.04min
2025-12-04T13:39:46.2339857Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T13:39:46.2398623Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:39:46.2400950Z Running torch_np/test_ndarray_methods 1/1 ... [2025-12-04 13:39:46.239932][3582094.764745342]
2025-12-04T13:39:46.2401219Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:39:46.2402645Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_ndarray_methods.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:39:46.240152]
2025-12-04T13:39:49.8603182Z
2025-12-04T13:39:49.8604822Z torch_np/test_ndarray_methods 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_ndarray_methods_1.1_e66a900311bb4307_.log
2025-12-04T13:39:49.8650364Z Running 342 items in this shard: test/torch_np/test_ndarray_methods.py::TestIndexing::test_indexing_simple, test/torch_np/test_ndarray_methods.py::TestIndexing::test_setitem, test/torch_np/test_ndarray_methods.py::TestReshape::test_reshape_function, test/torch_np/test_ndarray_methods.py::TestReshape::test_reshape_method, test/torch_np/test_ndarray_methods.py::TestTranspose::test_transpose_function, test/torch_np/test_ndarray_methods.py::TestTranspose::test_transpose_method, test/torch_np/test_ndarray_methods.py::TestRavel::test_ravel_function, test/torch_np/test_ndarray_methods.py::TestRavel::test_ravel_method, test/torch_np/test_ndarray_methods.py::TestNonzero::test_array_method, test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_onedim, test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_trivial, test/torch_np/test_ndarray_methods.py::TestNonzero::test_nonzero_twodim, test/torch_np/test_ndarray_methods.py::TestNonzero::test_sparse, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_all_method_max, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_all_method_min, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size0_axis0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size0_axis0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size10_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size10_axis_-1_method1,
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size11_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size11_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size12_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size12_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size13_axis13_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size13_axis13_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size14_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size14_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size15_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size15_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size16_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size17_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size17_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size18_axis18_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size18_axis18_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size19_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size1_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size1_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size20_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size20_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size21_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size22_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size22_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size23_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size24_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size24_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size25_axis25_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size26_axis_-3_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size27_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size27_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size28_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size28_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size29_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size2_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size2_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size30_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size31_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size31_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size32_axis32_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size32_axis32_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size33_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size33_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size34_axis_-3_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size34_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size35_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size35_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size36_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size36_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size37_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size37_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size38_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size38_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size39_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size39_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size3_axis3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size3_axis3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size40_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size40_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size41_axis41_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size41_axis41_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size42_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size42_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size43_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size43_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size44_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size45_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size45_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size46_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size47_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size47_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size48_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size48_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size49_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size49_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size4_axis_-2_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size4_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size50_axis50_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size50_axis50_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size51_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size51_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size52_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size52_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size53_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size53_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size54_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size54_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size55_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size56_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size57_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size57_axis_2_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size58_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size58_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size59_axis59_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size5_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size5_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size60_axis_-4_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size60_axis_-4_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size61_axis_-3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size61_axis_-3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size62_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size63_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size63_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size64_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size64_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size65_axis_1_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size65_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size66_axis_2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size66_axis_2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size67_axis_3_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size67_axis_3_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size68_axis68_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size69_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size69_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size6_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size6_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size70_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size71_axis71_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size71_axis71_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size72_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size72_axis_-1_method1, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size73_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size74_axis74_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size75_axis_-1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size75_axis_-1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size76_axis_0_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size76_axis_0_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size77_axis77_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size77_axis77_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size7_axis_1_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size8_axis8_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size8_axis8_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_argmin_argmax_keepdims_size9_axis_-2_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmax_np_method0, 
test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_arr_method_argmin_np_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_positional_arr_method_argmax_np_method0, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_np_vs_ndarray_positional_arr_method_argmin_np_method1, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_output_shape_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_output_shape_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_0_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_1_method_argmax, test/torch_np/test_ndarray_methods.py::TestArgmaxArgminCommon::test_ret_is_out_ndim_1_method_argmin, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data0, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data1, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data10, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data11, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data12, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data13, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data14, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data15, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data16, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data17, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data18, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data19, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data2, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data20, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data21, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data22, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data23, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data24, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data25, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data26, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data27, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data28, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data29, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data3, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data30, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data31, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data32, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data33, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data34, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data35, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data36, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data37, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data38, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data39, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data4, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data40, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data41, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data42, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data43, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data44, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data45, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data46, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data47, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data48, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data49, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data5, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data50, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data51, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data52, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data53, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data54, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data55, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data56, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data57, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data58, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data59, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data6, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data60, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data61, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data62, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data63, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data64, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data65, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data66, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data67, 
test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data68, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data69, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data7, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data70, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data71, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data72, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data73, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data8, test/torch_np/test_ndarray_methods.py::TestArgmax::test_combinations_data9, test/torch_np/test_ndarray_methods.py::TestArgmax::test_maximum_signed_integers, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data0, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data1, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data10, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data11, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data12, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data13, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data14, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data15, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data16, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data17, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data18, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data19, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data2, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data20, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data21, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data22, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data23, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data24, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data25, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data26, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data27, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data28, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data29, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data3, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data30, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data31, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data32, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data33, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data34, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data35, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data36, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data37, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data38, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data39, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data4, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data40, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data41, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data42, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data43, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data44, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data45, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data46, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data47, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data48, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data49, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data5, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data50, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data51, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data52, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data53, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data54, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data55, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data56, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data57, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data58, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data59, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data6, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data60, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data61, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data62, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data63, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data64, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data65, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data66, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data67, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data68, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data69, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data7, 
test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data70, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data71, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data72, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data73, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data8, test/torch_np/test_ndarray_methods.py::TestArgmin::test_combinations_data9, test/torch_np/test_ndarray_methods.py::TestArgmin::test_minimum_signed_integers, test/torch_np/test_ndarray_methods.py::TestAmax::test_basic, test/torch_np/test_ndarray_methods.py::TestAmin::test_basic, test/torch_np/test_ndarray_methods.py::TestContains::test_contains, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_fn, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_ivar, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_method, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_name, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_plain, test/torch_np/test_ndarray_methods.py::TestNoExtraMethods::test_extra_methods_name_rvar, test/torch_np/test_ndarray_methods.py::TestIter::test_iter_1d, test/torch_np/test_ndarray_methods.py::TestIter::test_iter_2d 2025-12-04T13:39:49.8692100Z 2025-12-04T13:39:49.8692228Z Finished torch_np/test_ndarray_methods 1/1 ... [2025-12-04 13:39:49.860296][3582098.385105793], took 0.06min 2025-12-04T13:39:49.8692647Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:39:49.8693003Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:39:49.8693209Z Running test_view_ops 1/1 ... 
[2025-12-04 13:39:49.867936][3582098.392749678] 2025-12-04T13:39:49.8693378Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:39:49.8693753Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_view_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:39:49.868159] 2025-12-04T13:39:58.3497225Z 2025-12-04T13:39:58.3498100Z test_view_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_view_ops_1.1_309ea56f5b09c06f_.log 2025-12-04T13:39:58.3529623Z Running 279 items in this shard: test/test_view_ops.py::TestViewOpsCUDA::test_T_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_advanced_indexing_assignment_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_advanced_indexing_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_gradients_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_as_strided_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_ellipses_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_newaxis_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_basic_indexing_slice_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_chunk_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_conj_imag_view_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_conj_imag_view_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int32, 
test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_conj_self_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_conj_view_with_shared_memory_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_contiguous_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_contiguous_self_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_diagonal_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_expand_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_expand_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_flatten_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_flatten_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_imag_noncomplex_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_movedim_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_narrow_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_permute_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_real_imag_view_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_real_imag_view_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_nonview_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_reshape_view_cuda, 
test/test_view_ops.py::TestViewOpsCUDA::test_select_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_bool, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_float64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_int8, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex128_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_bool, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_float64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int16, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int32, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int64, test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_int8, 
test/test_view_ops.py::TestViewOpsCUDA::test_set_real_imag_cuda_complex64_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_split_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_squeeze_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_squeeze_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_t_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_t_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_transpose_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_transpose_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unbind_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unbind_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unfold_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unsqueeze_inplace_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_unsqueeze_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_complex_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex32, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_real_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_as_view_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_out_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_copy_output_contiguous_cuda, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int16, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_new_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_dtype_upsize_errors_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int32, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_dsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_hsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_int8, 
test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_split_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_bfloat16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_bool, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_complex128, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_complex64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_float64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int16, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int32, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int64, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_int8, test/test_view_ops.py::TestViewOpsCUDA::test_view_tensor_vsplit_cuda_uint8, test/test_view_ops.py::TestViewOpsCUDA::test_view_view_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_T_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_as_strided_overflow_storage_offset_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_atleast_gradient_cuda, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_big_transpose_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_shapes_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_tensors_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_broadcast_to_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_chunk_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_conj_neg_view_numpy_error_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_contiguous_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_crow_col_indices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_empty_reshape_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_expand_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_flatten_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_memory_format_resize__cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_memory_format_resize_as_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_narrow_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_narrow_tensor_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_python_types_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_ravel_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_cuda, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_bfloat16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_reshape_view_semantics_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_as_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_as_preserves_strides_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_resize_overflow_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_split_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_t_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_errors_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float32, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_indices_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_tensor_split_sections_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_invalid_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transpose_vs_numpy_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_bfloat16, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_bfloat16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_bool, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_complex128, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_complex64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_float64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int16, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int32, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int64, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_int8, test/test_view_ops.py::TestOldViewOpsCUDA::test_transposes_errors_cuda_uint8, test/test_view_ops.py::TestOldViewOpsCUDA::test_unsqueeze_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_view_all_dtypes_and_devices_cuda, test/test_view_ops.py::TestOldViewOpsCUDA::test_view_cuda, 
test/test_view_ops.py::TestOldViewOpsCUDA::test_view_empty_cuda 2025-12-04T13:39:58.3555557Z 2025-12-04T13:39:58.3555670Z Finished test_view_ops 1/1 ... [2025-12-04 13:39:58.349757][3582106.874565321], took 0.14min 2025-12-04T13:39:58.3556056Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:39:58.3569121Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:39:58.3570962Z Running test_type_info 1/1 ... [2025-12-04 13:39:58.356992][3582106.881805632] 2025-12-04T13:39:58.3571146Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:39:58.3572868Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_type_info.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:39:58.357172] 2025-12-04T13:40:00.4748741Z 2025-12-04T13:40:00.4749725Z test_type_info 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_type_info_1.1_b3562e73c5e3157c_.log 2025-12-04T13:40:00.4751630Z Running 5 items in this shard: test/test_type_info.py::TestDTypeInfo::test_finfo, test/test_type_info.py::TestDTypeInfo::test_iinfo, test/test_type_info.py::TestDTypeInfo::test_invalid_input, test/test_type_info.py::TestDTypeInfo::test_to_complex, test/test_type_info.py::TestDTypeInfo::test_to_real 2025-12-04T13:40:00.4752908Z 2025-12-04T13:40:00.4753233Z Finished test_type_info 1/1 ... 
[2025-12-04 13:40:00.474588][3582108.999395623], took 0.04min 2025-12-04T13:40:00.4768313Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:40:00.4820935Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:40:00.4825091Z Running functorch/test_aotdispatch 1/1 ... [2025-12-04 13:40:00.482176][3582109.006990328] 2025-12-04T13:40:00.4825463Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:40:00.4826184Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'functorch/test_aotdispatch.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:40:00.482367] 2025-12-04T13:41:07.1134090Z 2025-12-04T13:41:07.1135132Z functorch/test_aotdispatch 1/1 was successful, full logs can be found in artifacts with path test/test-reports/functorch.test_aotdispatch_1.1_0086231914a4431b_.log 2025-12-04T13:41:07.1211054Z Running 537 items in this shard: test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_complex_linear, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_autograd, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_fw_bw_mutation_no_functionalization1, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_fw_bw_mutation_no_functionalization2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_alias_everything, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_detach_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_return, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_dupe_left_bias, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mark_outputs_dynamic_use_autograd_True, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_module, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_bad, 
test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutograd::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_ban_dropout_mut_pre_dispatch, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_multiple_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_forward_mutation_no_buffer_mut, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_functionalized_rng_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_dupes_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_input_requiring_grad_banned, 
test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_input_mutation_on_parameter_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_metadata_mutation_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_module_joint, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_multiple_outputs_require_grad_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_buffer_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_inplace, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_composite_implicit_linear, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_contiguous, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_conv_and_bn, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_composite_implicit, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_simple, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_func_view, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_1, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_map_2, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_outdtype, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_reshape, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_autograd_op, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_predispatch_with_cond_nested, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_basic, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_simplified_pytrees_banned, 
test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_synthetic_bases_banned, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_unbacked_arg, test/functorch/test_aotdispatch.py::TestAOTExport::test_aot_export_with_torch_cond, test/functorch/test_aotdispatch.py::TestPartitioning::test_autocast, test/functorch/test_aotdispatch.py::TestPartitioning::test_contiguous, test/functorch/test_aotdispatch.py::TestPartitioning::test_custom_partitioner_fn, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_getitem, test/functorch/test_aotdispatch.py::TestPartitioning::test_default_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_generate_gives_inference_graph, test/functorch/test_aotdispatch.py::TestPartitioning::test_meta_tensor_inplace_op, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_output_tensor_shape_tensor, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_raise_getitems, test/functorch/test_aotdispatch.py::TestPartitioning::test_min_cut_partitioner_save_shape, test/functorch/test_aotdispatch.py::TestPartitioning::test_preserve_random, test/functorch/test_aotdispatch.py::TestPartitioning::test_quantize_activation_duplicate_nodes, test/functorch/test_aotdispatch.py::TestPartitioning::test_recompute_partitioning, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_incorrect_backward, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_inference, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation, 
test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_input_mutation_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_alias, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_output_requires_grad_in_no_grad_views, test/functorch/test_aotdispatch.py::TestAOTDispatch::test_aot_dispatch_simple, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_dynamic, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_fake_tensor_gm_raises, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_module_simplified_preserves_stack_trace_from_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_aot_test_subclasses_with_tensor_factories, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_flex_attn_noncontiguous_tangents, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_dense, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_nested_tensor_tangent, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_grads_no_force_contiguous_subclass, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inductor_freezing_with_subclasses, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_inference_python_dispatcher, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_layer_norm, 
test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_lift_fresh_copy_in_graph, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_False_test_subclasses_True_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_False_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cpu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_noncontig_nonmemformat_tangents_dynamic_shapes_True_test_subclasses_True_device_cuda, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rms_norm, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_rrelu_with_noise_mutation, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_all, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_donated, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_base_saved_tensors_hooks_filtering_mode_no_static, 
test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_donated_buffers, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_params, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_saved_tensors_hooks_recompile, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_subclass_parameters_torture_case, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_tangent_type_coercion, test/functorch/test_aotdispatch.py::TestAOTModuleSimplified::test_wrong_guess_tangent_type, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_autograd, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_fw_bw_mutation_no_functionalization1, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_fw_bw_mutation_no_functionalization2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_alias_everything, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_detach_mixed, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_dupe_left_bias, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutation_of_input_in_fw_and_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_new_inp_requires_grad_now, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_no_grad_input_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_returned_multiple_times, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_single, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_squeeze_mutation, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithDynamo::test_view_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_aot_eager_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_False_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_False, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_alias_of_intermediate_detach_backend_inductor_view_replay_for_aliased_outputs_True_dynamic_shapes_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_autocast_disable_guard, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_data, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_forward_inputs_create_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_mutation_on_grad_out, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_custom, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_off, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_backward_pass_autocast_on, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batch_norm_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_batchnorm_inference, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_batch_norm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_buffer_copied_in_graph_with_different_shapes, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_compilation_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_complex_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_composite_impl_compile, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_autograd, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_custom_tensor_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_default_partitioner_saves_symints_not_tensors_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_returned_as_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dupe_arg_torture, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_duplicated_arguments_on_tensor_overlap, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_dynamic_shape_output_not_in_bw_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_embedding_bag_view_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_fw_bw_mutation_no_functionalization1, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_fw_bw_mutation_no_functionalization2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_grad_context, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inference_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_aliased_with_mutation_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_data_and_metadata_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_inplace_requires_grad_true, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_metadata_mutation_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_alias_everything, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_none_require_gradients, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_and_output_alias, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_bases_out_of_order, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_aliases_other_input2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_and_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_batchnorm, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_false_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_hidden_from_autograd_aliasing, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_is_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_metadata2, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_modifies_autograd_meta_of_aliases, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_output_view_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_detach_mixed, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_requires_grad_no_grad_inference_graph, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_return, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__input_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_set__nop, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_simple_with_none_and_nontensor, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_before_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_down_and_set_, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_mutation_storage_resize_up, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_aliase_custom_autograd_function, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_metadata_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_mutate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_input_output_view_simple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_unsqueeze_with_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inputs_overlapping_with_mutation_guard_base, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_dupe_left_bias, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_invalid_requires_grad_fake, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_list_codegen, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_activations_dynamic_with_nested, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_False, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mark_outputs_dynamic_use_autograd_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mem_leak_from_save_for_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_module, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_multi_output_list, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutates_input_noncontiguous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutation_of_input_in_fw_and_bw, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_mutations_in_bw_detached_from_tangent, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_complicated_inps_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_homogenous, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nested_subclasses_non_nested_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_new_inp_requires_grad_now, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_no_grad_input_output, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_non_tensor_and_none_inputs, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_nonidempotent_amp, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_multi_output_view_should_raise_autograd_error, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_input_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_different_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_and_returned_flipped, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_and_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_inplace_view_with_detach, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multi_output_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_multiple_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_mutation_linear, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_no_grad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_returned_multiple_times, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_single, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_intermediate_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_multiple_inputs_get_correct_one, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_aliases_output_view_meta_replay, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_all_alias_types, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_dict, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_output_op_depending_on_symint, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_outputs_are_aliased, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_real_weights_in_symbolic_mode_with_inplace_ops, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_saved_tensors_hooks_mutations_raise, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_bad, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__and_data_mutation_good, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__not_allowed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_set__steals_view_chain, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_single_output, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_output_requires_grad_input_doesnt, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_non_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_some_outputs_dont_require_grad_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_squeeze_mutation, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_False, 
test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclass_metadata_mutation_req_grad_True, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_subclasses_mixed_mode, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_synthetic_base_base_attribute_is_none, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_and_inplace_view, test/functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_view_detach 2025-12-04T13:41:07.1283874Z 2025-12-04T13:41:07.1284005Z Finished functorch/test_aotdispatch 1/1 ... [2025-12-04 13:41:07.113772][3582175.63858141], took 1.11min 2025-12-04T13:41:07.1284402Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:41:07.1284762Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:41:07.1284983Z Running test_nn 1/2 ... [2025-12-04 13:41:07.121034][3582175.645848311] 2025-12-04T13:41:07.1285141Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:41:07.1285500Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_nn.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:41:07.121220] 2025-12-04T13:46:34.2824323Z 2025-12-04T13:46:34.2824971Z test_nn 1/2 was successful, full logs can be found in artifacts with path test/test-reports/test_nn_1.2_96630dbaed30b28a_.log 2025-12-04T13:46:34.3035537Z Running 1196 items in this shard: test/test_nn.py::TestNN::test_AdaptiveLogSoftmax, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_BCELoss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_BCELoss_no_reduce_cuda, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_BCELoss_weights_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_legacy_enum_cuda, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_none_cuda_tf32, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_BCEWithLogitsLoss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_CELU_no_batch_dim, 
test/test_nn.py::TestNN::test_CELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_CTCLoss_zero_lengths, test/test_nn.py::TestNN::test_Conv1d, test/test_nn.py::TestNN::test_Conv1d_dilated, test/test_nn.py::TestNN::test_Conv1d_groups_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_groups_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_pad1, test/test_nn.py::TestNN::test_Conv1d_pad1size1_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_pad2_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_pad2size1_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_pad_same2, test/test_nn.py::TestNN::test_Conv1d_pad_same2_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_pad_same_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_pad_same_dilated, test/test_nn.py::TestNN::test_Conv1d_pad_same_dilated_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_reflect_stride2_pad2_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_stride, test/test_nn.py::TestNN::test_Conv1d_stride_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_stride_cuda_tf32, test/test_nn.py::TestNN::test_Conv1d_zero_batch_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_zeros_stride2_pad2, test/test_nn.py::TestNN::test_Conv1d_zeros_stride2_pad2_cuda_fp32, test/test_nn.py::TestNN::test_Conv1d_zeros_stride2_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_depthwise, test/test_nn.py::TestNN::test_Conv2d_depthwise_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_depthwise_dilated, test/test_nn.py::TestNN::test_Conv2d_depthwise_dilated_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_depthwise_strided, test/test_nn.py::TestNN::test_Conv2d_depthwise_strided_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_depthwise_with_multiplier_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_dilated, test/test_nn.py::TestNN::test_Conv2d_dilated_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_dilated_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_dilated_with_long_tensor_cuda_tf32, 
test/test_nn.py::TestNN::test_Conv2d_groups, test/test_nn.py::TestNN::test_Conv2d_groups_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_groups_thnn, test/test_nn.py::TestNN::test_Conv2d_groups_thnn_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_groups_thnn_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_groups_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_groups_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_groups_with_long_tensor_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_no_bias, test/test_nn.py::TestNN::test_Conv2d_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_pad_same, test/test_nn.py::TestNN::test_Conv2d_pad_same_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_pad_same_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_pad_same_dilated_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_pad_same_dilated_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_padding_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_padding_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_reflect_stride2_pad2, test/test_nn.py::TestNN::test_Conv2d_reflect_stride2_pad2_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_reflect_stride2_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_replicate_stride2_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_strided_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_strided_cuda_tf32, test/test_nn.py::TestNN::test_Conv2d_zero_batch_cuda_fp32, test/test_nn.py::TestNN::test_Conv2d_zero_batch_with_long_tensor, test/test_nn.py::TestNN::test_Conv2d_zero_batch_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_1x1x1_no_bias_with_long_tensor_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_circular_stride2_pad2, test/test_nn.py::TestNN::test_Conv3d_circular_stride2_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_dilated_cuda_tf32, 
test/test_nn.py::TestNN::test_Conv3d_groups_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_no_bias_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_no_bias_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_no_bias_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_pad_same, test/test_nn.py::TestNN::test_Conv3d_pad_same_dilated_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_pad_valid, test/test_nn.py::TestNN::test_Conv3d_pad_valid_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_replicate_stride2_pad2_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_stride, test/test_nn.py::TestNN::test_Conv3d_stride_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_stride_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_stride_padding, test/test_nn.py::TestNN::test_Conv3d_stride_padding_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_stride_with_long_tensor_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_with_long_tensor, test/test_nn.py::TestNN::test_Conv3d_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_with_long_tensor_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_zero_batch, test/test_nn.py::TestNN::test_Conv3d_zero_batch_cuda_fp32, test/test_nn.py::TestNN::test_Conv3d_zero_batch_cuda_tf32, test/test_nn.py::TestNN::test_Conv3d_zero_batch_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose1d, test/test_nn.py::TestNN::test_ConvTranspose1d_cuda_tf32, test/test_nn.py::TestNN::test_ConvTranspose1d_dilated, test/test_nn.py::TestNN::test_ConvTranspose1d_dilated_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose1d_groups, test/test_nn.py::TestNN::test_ConvTranspose1d_no_bias, test/test_nn.py::TestNN::test_ConvTranspose1d_no_bias_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_cuda_tf32, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_cuda_tf32, test/test_nn.py::TestNN::test_ConvTranspose2d_dilated_with_long_tensor_cuda_tf32, 
test/test_nn.py::TestNN::test_ConvTranspose2d_groups, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_cuda_tf32, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_groups_with_long_tensor_cuda_tf32, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_no_bias_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose2d_with_long_tensor, test/test_nn.py::TestNN::test_ConvTranspose2d_with_long_tensor_cuda_fp32, test/test_nn.py::TestNN::test_ConvTranspose3d_dilated, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean_cuda_fp32, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_none_cuda_tf32, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_CosineEmbeddingLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_CrossMapLRN2d, test/test_nn.py::TestNN::test_ELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_discontiguous_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_max_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_mean, test/test_nn.py::TestNN::test_EmbeddingBag_mean_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_mean_padding_idx, test/test_nn.py::TestNN::test_EmbeddingBag_sparse, test/test_nn.py::TestNN::test_EmbeddingBag_sparse_cuda, test/test_nn.py::TestNN::test_EmbeddingBag_sum, test/test_nn.py::TestNN::test_Embedding_discontiguous, test/test_nn.py::TestNN::test_Embedding_discontiguous_cuda, 
test/test_nn.py::TestNN::test_Embedding_sparse, test/test_nn.py::TestNN::test_Flatten_cuda, test/test_nn.py::TestNN::test_Fold_cuda, test/test_nn.py::TestNN::test_Fold_no_batch_dim_input, test/test_nn.py::TestNN::test_Fold_no_batch_dim_int_input, test/test_nn.py::TestNN::test_Fold_no_batch_dim_int_input_cuda, test/test_nn.py::TestNN::test_GELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardshrink_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Hardswish_no_batch_dim, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_HingeEmbeddingLoss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_HuberLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_KLDivLoss_batch_mean, test/test_nn.py::TestNN::test_KLDivLoss_batch_mean_log_target, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_KLDivLoss_no_batch_dim_sum_cuda_fp32, 
test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar_log_target, test/test_nn.py::TestNN::test_KLDivLoss_no_reduce_scalar_log_target_cuda, test/test_nn.py::TestNN::test_KLDivLoss_with_log_target_no_reduce, test/test_nn.py::TestNN::test_KLDivLoss_with_target_no_reduce, test/test_nn.py::TestNN::test_KLDivLoss_with_target_no_reduce_cuda, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_mean, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_none_cuda_tf32, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_L1Loss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_L1Loss_no_reduce, test/test_nn.py::TestNN::test_L1Loss_no_reduce_complex_cuda, test/test_nn.py::TestNN::test_L1Loss_no_reduce_scalar, test/test_nn.py::TestNN::test_LSTM_cell, test/test_nn.py::TestNN::test_LSTM_cell_forward_input_size, test/test_nn.py::TestNN::test_LayerNorm_3d_no_affine_large_feature_cuda, test/test_nn.py::TestNN::test_LeakyReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Linear_cuda_tf32, test/test_nn.py::TestNN::test_Linear_no_batch_dim_cuda_fp32, test/test_nn.py::TestNN::test_Linear_no_batch_dim_cuda_tf32, test/test_nn.py::TestNN::test_Linear_no_bias_cuda_fp32, test/test_nn.py::TestNN::test_Linear_no_bias_cuda_tf32, test/test_nn.py::TestNN::test_LogSigmoid_no_batch_dim_cuda, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_fp32, 
test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_MSELoss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_MSELoss_no_reduce, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_none_cuda_tf32, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_MarginRankingLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MaxUnpool1d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool1d_net_no_batch_dim, test/test_nn.py::TestNN::test_MaxUnpool1d_net_no_batch_dim_cuda, test/test_nn.py::TestNN::test_MaxUnpool2d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool2d_net_no_batch_dim, test/test_nn.py::TestNN::test_MaxUnpool3d_net, test/test_nn.py::TestNN::test_MaxUnpool3d_net_cuda, test/test_nn.py::TestNN::test_MaxUnpool3d_net_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Mish_no_batch_dim_cuda, test/test_nn.py::TestNN::test_ModuleList, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_1d_no_reduce, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_index_neg_cuda, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_fp32, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_mean_cuda_tf32, 
test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_reduce, test/test_nn.py::TestNN::test_MultiLabelMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_mean_cuda_fp32, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_reduce, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_weights_no_reduce, test/test_nn.py::TestNN::test_MultiLabelSoftMarginLoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_MultiMarginLoss_1d_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_margin_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_weights_no_reduce, test/test_nn.py::TestNN::test_MultiMarginLoss_weights_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss2d_no_reduce_weights, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_ignore_index, test/test_nn.py::TestNN::test_NLLLossNd_no_reduce_weights, 
test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_none_cuda_double, test/test_nn.py::TestNN::test_NLLLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_NLLLoss_no_reduce, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_ignore_index, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index_cuda, test/test_nn.py::TestNN::test_NLLLoss_no_reduce_weights_ignore_index_neg_cuda, test/test_nn.py::TestNN::test_PReLU_no_batch_dim, test/test_nn.py::TestNN::test_PReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_PairwiseDistance, test/test_nn.py::TestNN::test_PairwiseDistance_with_non_default_args_cuda, test/test_nn.py::TestNN::test_ParameterDict, test/test_nn.py::TestNN::test_ParameterList, test/test_nn.py::TestNN::test_ParameterList_meta, test/test_nn.py::TestNN::test_PixelShuffle_cuda, test/test_nn.py::TestNN::test_PixelUnshuffle, test/test_nn.py::TestNN::test_PixelUnshuffle_cuda, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_fp32, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_half, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_PoissonNLLLoss_no_reduce, test/test_nn.py::TestNN::test_RNN_cell_forward_zero_hidden_size, 
test/test_nn.py::TestNN::test_RNN_cell_no_broadcasting, test/test_nn.py::TestNN::test_RNN_dropout_state, test/test_nn.py::TestNN::test_RReLU_cuda, test/test_nn.py::TestNN::test_RReLU_no_batch_dim, test/test_nn.py::TestNN::test_RReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_RReLU_with_up_down_cuda, test/test_nn.py::TestNN::test_RReLU_with_up_down_scalar, test/test_nn.py::TestNN::test_ReLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d_complex_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d_cuda, test/test_nn.py::TestNN::test_ReplicationPad3d_no_batch_dim_cuda, test/test_nn.py::TestNN::test_SELU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Sequential_extend, test/test_nn.py::TestNN::test_Sequential_getitem, test/test_nn.py::TestNN::test_Sequential_iadd, test/test_nn.py::TestNN::test_Sequential_rmul, test/test_nn.py::TestNN::test_SiLU_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Sigmoid_no_batch_dim_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_beta, test/test_nn.py::TestNN::test_SmoothL1Loss_beta_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_fp32, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_none, test/test_nn.py::TestNN::test_SmoothL1Loss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_no_reduce_scalar_cuda, test/test_nn.py::TestNN::test_SmoothL1Loss_zero_beta_cuda, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_mean_cuda_tf32, test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_none_cuda_fp32, 
test/test_nn.py::TestNN::test_SoftMarginLoss_no_batch_dim_sum_cuda_tf32, test/test_nn.py::TestNN::test_SoftMarginLoss_no_reduce_cuda, test/test_nn.py::TestNN::test_Softplus_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Softshrink_no_batch_dim, test/test_nn.py::TestNN::test_Softsign_no_batch_dim_cuda, test/test_nn.py::TestNN::test_Tanhshrink_no_batch_dim_cuda, test/test_nn.py::TestNN::test_TransformerDecoderLayer_relu_activation_cuda_tf32, test/test_nn.py::TestNN::test_TransformerEncoderLayer_gelu_activation, test/test_nn.py::TestNN::test_TransformerEncoderLayer_relu_activation_cuda_fp32, test/test_nn.py::TestNN::test_Transformer_cell, test/test_nn.py::TestNN::test_Transformer_multilayer_coder, test/test_nn.py::TestNN::test_Transformer_multilayer_coder_cuda_fp32, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean_cuda_double, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_mean_cuda_half, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none_cuda_fp32, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_none_cuda_half, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum, test/test_nn.py::TestNN::test_TripletMarginLoss_no_batch_dim_sum_cuda_double, test/test_nn.py::TestNN::test_Unfold, test/test_nn.py::TestNN::test_Unfold_cuda, test/test_nn.py::TestNN::test_add_module_raises_error_if_attr_exists, test/test_nn.py::TestNN::test_affine_grid_3d, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cpu_nd_2, test/test_nn.py::TestNN::test_affine_grid_backward_cl_cf_consistency_device_cuda_nd_3, test/test_nn.py::TestNN::test_affine_grid_error_checking, test/test_nn.py::TestNN::test_assignment, test/test_nn.py::TestNN::test_batch_norm_update_stats, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_cpu_mixed_float16, 
test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_inference_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_NCHW_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_2D_train_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_NCHW_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_cpu_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_inference_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_cpu_mixed_bfloat16, 
test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NCHW_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_NCHW_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_NCHW_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_cpu_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_float32, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_mixed_bfloat16, test/test_nn.py::TestNN::test_batchnorm_3D_train_NHWC_vs_native_mixed_float16, test/test_nn.py::TestNN::test_batchnorm_buffer_update_when_stats_are_not_tracked, test/test_nn.py::TestNN::test_batchnorm_cudnn_half, test/test_nn.py::TestNN::test_batchnorm_cudnn_nhwc, test/test_nn.py::TestNN::test_batchnorm_load_state_dict, test/test_nn.py::TestNN::test_batchnorm_nhwc_cpu, test/test_nn.py::TestNN::test_batchnorm_nhwc_cuda, test/test_nn.py::TestNN::test_batchnorm_non_contig_cpu_BatchNorm2d, test/test_nn.py::TestNN::test_batchnorm_nonaffine_cuda_half_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_bias_is_not_same_size_as_input, test/test_nn.py::TestNN::test_batchnorm_raises_error_if_less_than_one_value_per_channel, test/test_nn.py::TestNN::test_bce_with_logits_broadcasts_weights, test/test_nn.py::TestNN::test_bce_with_logits_gives_same_result_as_sigmoid_and_bce_loss, test/test_nn.py::TestNN::test_bce_with_logits_gives_same_result_as_sigmoid_and_bce_loss_large_tensors_with_grad, test/test_nn.py::TestNN::test_bce_with_logits_has_correct_forward_grad, test/test_nn.py::TestNN::test_bce_with_logits_ones_in_pos_weights_are_the_same_as_none, test/test_nn.py::TestNN::test_bce_with_logits_with_pos_weight_has_correct_grad_at_zero, test/test_nn.py::TestNN::test_bilinear, 
test/test_nn.py::TestNN::test_broadcast_double_backwards_gpu, test/test_nn.py::TestNN::test_broadcast_no_grad, test/test_nn.py::TestNN::test_buffer_bad_module_subclass, test/test_nn.py::TestNN::test_buffer_not_persistent_load, test/test_nn.py::TestNN::test_buffers_and_named_buffers, test/test_nn.py::TestNN::test_cosine_embedding_loss_margin_no_reduce, test/test_nn.py::TestNN::test_cosine_embedding_loss_no_reduce, test/test_nn.py::TestNN::test_cosine_embedding_loss_with_diff_type, test/test_nn.py::TestNN::test_cross_entropy_loss_precision, test/test_nn.py::TestNN::test_cudnn_rnn_dropout_states_device, test/test_nn.py::TestNN::test_cudnn_weight_tying, test/test_nn.py::TestNN::test_extra_state, test/test_nn.py::TestNN::test_extra_state_missing_set_extra_state, test/test_nn.py::TestNN::test_fb_fc_packed, test/test_nn.py::TestNN::test_flatten, test/test_nn.py::TestNN::test_fractional_max_pool2d_invalid_output_ratio, test/test_nn.py::TestNN::test_gaussian_nll_loss_args, test/test_nn.py::TestNN::test_gaussian_nll_loss_broadcasting, test/test_nn.py::TestNN::test_gaussian_nll_loss_scalar_var, test/test_nn.py::TestNN::test_get_buffer, test/test_nn.py::TestNN::test_grid_sample, test/test_nn.py::TestNN::test_grid_sample_3d, test/test_nn.py::TestNN::test_grid_sample_error_checking, test/test_nn.py::TestNN::test_hardtanh_backward, test/test_nn.py::TestNN::test_hardtanh_inplace_gradgrad, test/test_nn.py::TestNN::test_huber_loss_invalid_delta, test/test_nn.py::TestNN::test_inplace_thnn, test/test_nn.py::TestNN::test_interpolate_bicubic_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_shared_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_shared_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_align_corners, 
test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_scale_tuple_skewed_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_align_corners, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bicubic_tuple_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_2d, test/test_nn.py::TestNN::test_interpolate_bilinear_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_scale_tuple_skewed_2d_cuda, test/test_nn.py::TestNN::test_interpolate_bilinear_tuple_2d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_linear_1d_align_corners, test/test_nn.py::TestNN::test_interpolate_linear_1d_cuda, test/test_nn.py::TestNN::test_interpolate_linear_1d_zero_dim, test/test_nn.py::TestNN::test_interpolate_linear_1d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d_align_corners, test/test_nn.py::TestNN::test_interpolate_linear_scale_1d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_linear_tuple_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_1d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_2d_launch_configs, test/test_nn.py::TestNN::test_interpolate_nearest_2d_launch_configs_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_2d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_3d, test/test_nn.py::TestNN::test_interpolate_nearest_3d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_3d_zero_dim, test/test_nn.py::TestNN::test_interpolate_nearest_3d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_scale_1d_cuda, 
test/test_nn.py::TestNN::test_interpolate_nearest_scale_2d, test/test_nn.py::TestNN::test_interpolate_nearest_scale_2d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_scale_3d, test/test_nn.py::TestNN::test_interpolate_nearest_scale_3d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_1d, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_1d_cuda, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_2d, test/test_nn.py::TestNN::test_interpolate_nearest_tuple_3d_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_3d, test/test_nn.py::TestNN::test_interpolate_trilinear_3d_zero_dim_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_scale_3d_align_corners, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_align_corners, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_align_corners_cuda, test/test_nn.py::TestNN::test_interpolate_trilinear_tuple_3d_cuda, test/test_nn.py::TestNN::test_interpolate_undefined_behavior_casting, test/test_nn.py::TestNN::test_l1_loss_correct, test/test_nn.py::TestNN::test_large_max_pool2d_ch_last, test/test_nn.py::TestNN::test_layer_norm_grads_with_create_graph_flag, test/test_nn.py::TestNN::test_layer_norm_large_tensor, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_bias_weightCSR, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cpu_nobias_weightStrided, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightCOO, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_bias_weightCSC, test/test_nn.py::TestNN::test_linear_autograd_device_cuda_nobias_weightStrided, test/test_nn.py::TestNN::test_log_softmax_scalar, test/test_nn.py::TestNN::test_log_softmax_spatial_special_cuda, test/test_nn.py::TestNN::test_loss_equal_input_target_shape, test/test_nn.py::TestNN::test_margin_ranking_loss_no_reduce, test/test_nn.py::TestNN::test_module_backcompat, 
test/test_nn.py::TestNN::test_module_super_init, test/test_nn.py::TestNN::test_modules, test/test_nn.py::TestNN::test_multimarginloss_1d_input_0d_target_no_reduce, test/test_nn.py::TestNN::test_named_modules, test/test_nn.py::TestNN::test_named_parameters_remove_duplicate, test/test_nn.py::TestNN::test_nested_tensor_from_mask, test/test_nn.py::TestNN::test_overwrite_module_params_on_conversion, test/test_nn.py::TestNN::test_pack_sequence_batch_sizes_throw, test/test_nn.py::TestNN::test_padding_list, test/test_nn.py::TestNN::test_parameterlistdict_setting_attributes, test/test_nn.py::TestNN::test_pdist_empty_col, test/test_nn.py::TestNN::test_pickle_module_no_weights_only_warning, test/test_nn.py::TestNN::test_pixel_shuffle_nhwc_cpu, test/test_nn.py::TestNN::test_pixel_shuffle_unshuffle, test/test_nn.py::TestNN::test_pointwise_loss_broadcast, test/test_nn.py::TestNN::test_projections_errors_on_gru_and_rnn, test/test_nn.py::TestNN::test_projections_lstm_args_check, test/test_nn.py::TestNN::test_projections_lstm_initial_hidden_state, test/test_nn.py::TestNN::test_register_buffer_allows_overwriting_with_same_name, test/test_nn.py::TestNN::test_register_buffer_allows_tensor_like_object, test/test_nn.py::TestNN::test_register_buffer_raises_error_if_attr_exists, test/test_nn.py::TestNN::test_register_parameter_raises_error_if_name_is_not_string, test/test_nn.py::TestNN::test_relu_inplace_on_view, test/test_nn.py::TestNN::test_rnn_check_device, test/test_nn.py::TestNN::test_rnn_initial_hidden_state, test/test_nn.py::TestNN::test_rnn_weight_norm, test/test_nn.py::TestNN::test_set_submodule, test/test_nn.py::TestNN::test_smoothl1loss_intergral_target, test/test_nn.py::TestNN::test_softmax_functional_dim0, test/test_nn.py::TestNN::test_softmax_functional_dim0_cuda, test/test_nn.py::TestNN::test_softmax_functional_dim3, test/test_nn.py::TestNN::test_softmax_lastdim, test/test_nn.py::TestNN::test_softmax_lastdim_dtype, test/test_nn.py::TestNN::test_softmax_spatial, 
test/test_nn.py::TestNN::test_softmax_spatial_dtype_cuda, test/test_nn.py::TestNN::test_softmax_spatial_special, test/test_nn.py::TestNN::test_softmin, test/test_nn.py::TestNN::test_spectral_norm, test/test_nn.py::TestNN::test_spectral_norm_dim, test/test_nn.py::TestNN::test_spectral_norm_forward, test/test_nn.py::TestNN::test_spectral_norm_load_state_dict, test/test_nn.py::TestNN::test_spectral_norm_pickle, test/test_nn.py::TestNN::test_state_dict, test/test_nn.py::TestNN::test_swap_module_params_poisons_acc_grad, test/test_nn.py::TestNN::test_to, test/test_nn.py::TestNN::test_transformer_args_check, test/test_nn.py::TestNN::test_transformerdecoderlayer_gelu, test/test_nn.py::TestNN::test_triplet_margin_loss_no_reduce, test/test_nn.py::TestNN::test_triplet_margin_loss_swap, test/test_nn.py::TestNN::test_type, test/test_nn.py::TestNN::test_unflatten, test/test_nn.py::TestNN::test_unfold_invalid_arg, test/test_nn.py::TestNN::test_upsamplingBilinear2d_spatial_invariance, test/test_nn.py::TestNN::test_upsamplingLinear1d_spatial_invariance, test/test_nn.py::TestNN::test_upsampling_bfloat16, test/test_nn.py::TestNN::test_upsampling_not_recompute_scale_factor, test/test_nn.py::TestNN::test_upsampling_small_scale, test/test_nn.py::TestNN::test_weighted_huber_loss, test/test_nn.py::TestNN::test_weighted_l1_loss_with_weights, test/test_nn.py::TestNN::test_weighted_mse_loss, test/test_nn.py::TestFusionEval::test_fuse_module_eval_numerics, test/test_nn.py::TestConstantPadNd::test_constant_pad_nd, test/test_nn.py::TestAddRelu::test_add_relu, test/test_nn.py::TestFunctionalPickle::test_pickle_softsign, test/test_nn.py::TestFusionUtils::test_fuse_linear_bn_requires_grad, test/test_nn.py::TestUtils::test_consume_prefix_in_state_dict_if_present, test/test_nn.py::TestNNDeviceTypeCUDA::test_BatchNorm_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_Bilinear_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_cudnn_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_mean_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_none_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_CTCLoss_no_batch_dim_reduction_sum_use_module_form_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_numeric_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_GroupNorm_raises_error_if_one_value_per_group_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_InstanceNorm1d_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_LSTM_differentiable_backward_using_oneDNN_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_LayerNorm_general_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_LocalResponseNorm_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_MarginLoss_race_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad3d_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad_empty_cuda_complex64, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReflectionPad_fails_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ReplicationPad_empty_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerEncoderLayer_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_TransformerEncoder_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotate0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_affine_2d_rotateRandom_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_affine_mixed_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_cuda_float32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_eval_mixed_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_simple_average_mixed_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_batchnorm_update_stats_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_0_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_1_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_4_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_False_norm_type_inf_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_0_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_2_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_4_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_clip_grad_norm_foreach_True_norm_type_inf_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_conv_empty_input_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_64bit_reduction_none_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_consistent_index_target_and_probs_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_label_smoothing_errors_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_mean_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_none_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_large_tensor_reduction_sum_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_all_reductions_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_none_weighted_True_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_sum_weighted_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_no_batch_dim_reduction_sum_weighted_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cross_entropy_loss_prob_target_unit_weights_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_ctc_loss_error_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_cudnn_rnn_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_elu_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_elu_inplace_with_neg_alpha_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_fold_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_half_precision_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_2d_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_2d_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_large_index_3d_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_grid_sample_nan_inf_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_groupnorm_nhwc_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_gumbel_softmax_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_gumbel_softmax_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_corner_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_corner_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_hardswish_grad_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_for_single_spatial_element_during_training_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm1d_no_batch_dim_False_affine_True_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_False_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm2d_no_batch_dim_True_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_True_affine_False_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_input_channels_is_not_num_features_InstanceNorm3d_no_batch_dim_True_affine_True_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_instancenorm_raises_error_if_less_than_one_value_per_channel_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_large_max_pool2d_ch_last_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_layernorm_half_precision_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_layernorm_weight_bias_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_leaky_relu_inplace_with_neg_slope_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_leaky_relu_inplace_with_zero_slope_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_cpu_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_log_softmax_cpu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_forward_with_nans_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_mask_types_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_masked_softmax_transformer_layout_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_module_to_empty_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_module_to_empty_non_recursive_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_1d_input_1d_target_invalid_size_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_all_ignored_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_byte_target_matches_long_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_empty_tensor_reduction_sum_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_invalid_weights_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_large_tensor_reduction_sum_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nll_loss_out_of_bounds_ignore_index_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nn_empty_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nn_scalars_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_nonlinearity_propagate_nan_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_one_hot_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_pad_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rmsnorm_epsilon_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_fused_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_retain_variables_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_rnn_retain_variables_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_rrelu_bounds_validation_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_save_lstm_compatibility_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_skip_init_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_smooth_l1_loss_vs_huber_loss_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_64bit_indexing_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_smem_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_unaligned_grad_output_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_backward_unaligned_output_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_bfloat16_half_to_float_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_cpu_cuda_bfloat16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_double_cuda_float64, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_forward_64bit_indexing_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softmax_results_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_softplus_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softshrink_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_softshrink_negative_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_threshold_inplace_overlap_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_complex128, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_complex64, test/test_nn.py::TestNNDeviceTypeCUDA::test_to_complex_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_fast_path_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_gelu_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_transformerencoderlayer_gelu_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_True_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format0_align_corners_True_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_False_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_False_input_size_403_output_size_377_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_True_input_size_399_output_size_437_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiLinear2d_consistency_interp_size_bug_memory_format1_align_corners_True_input_size_403_output_size_377_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bicubic_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_False_align_corners_False_mode_bilinear_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bicubic_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_False_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bilinear_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_antialias_True_align_corners_True_mode_bilinear_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format0_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bicubic_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_False_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_False_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_3_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_32_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_False_batch_size_5_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_False_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_False_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_restrided_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_consistency_memory_format1_mode_bilinear_antialias_True_align_corners_True_num_channels_5_output_size_600_check_as_unsqueezed_3d_tensor_True_non_contig_sliced_batch_size_5_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int16_cuda_int16, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bicubic_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_bilinear_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest-exact_uint8_cuda_uint8, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_3_mode_nearest_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bicubic_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_bilinear_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest-exact_int32_cuda_int32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_False_num_channels_5_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int32_cuda_int32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest-exact_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_int32_cuda_int32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_3_mode_nearest_uint8_cuda_uint8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bicubic_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int32_cuda_int32, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_bilinear_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_float32_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_float64_cuda_float64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int16_cuda_int16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int64_cuda_int64, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBiMode2d_nonsupported_dtypes_antialias_True_num_channels_5_mode_nearest_int8_cuda_int8, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBicubic2d_correctness_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBilinear2d_aa_correctness_memory_format0_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingBilinear2d_aa_correctness_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_launch_config_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest1d_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_launch_config_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_launch_rocm_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest2d_memory_format1_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_correctness_memory_format0_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_launch_config_cuda, 
test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format0_mode_nearest_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearest3d_memory_format1_mode_nearest-exact_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact1d_correctness_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format0_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format1_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact2d_correctness_memory_format1_isize_20_osize_11_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingNearestExact3d_correctness_memory_format0_isize_10_osize_15_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingTrilinear3d_align_corners_False_memory_format1_cuda, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsampling_64bit_indexing_channels_last_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_upsamplingnearest2d_backward_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_variable_sequence_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_variable_sequence_cuda_float32, test/test_nn.py::TestNNDeviceTypeCUDA::test_warp_softmax_64bit_indexing_cuda_float16, test/test_nn.py::TestNNDeviceTypeCUDA::test_warp_softmax_64bit_indexing_cuda_float32
2025-12-04T13:46:34.3243649Z
2025-12-04T13:46:34.3243755Z Finished test_nn 1/2 ... [2025-12-04 13:46:34.283520][3582502.808329193], took 5.45min
2025-12-04T13:46:34.3244160Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml
2025-12-04T13:46:34.3244543Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:46:34.3244784Z Running torch_np/test_reductions 1/1 ... [2025-12-04 13:46:34.290693][3582502.815507217]
2025-12-04T13:46:34.3244980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T13:46:34.3245400Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/test_reductions.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:46:34.290881]
2025-12-04T13:46:37.7101898Z
2025-12-04T13:46:37.7102742Z torch_np/test_reductions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.test_reductions_1.1_5739da353526e535_.log
2025-12-04T13:46:37.7237528Z Running 966 items in this shard: test/torch_np/test_reductions.py::TestFlatnonzero::test_basic, test/torch_np/test_reductions.py::TestAny::test_basic, test/torch_np/test_reductions.py::TestAny::test_method_vs_function, test/torch_np/test_reductions.py::TestAny::test_nd, test/torch_np/test_reductions.py::TestAll::test_basic, test/torch_np/test_reductions.py::TestAll::test_method_vs_function, test/torch_np/test_reductions.py::TestAll::test_nd, test/torch_np/test_reductions.py::TestMean::test_mean, test/torch_np/test_reductions.py::TestMean::test_mean_float16, test/torch_np/test_reductions.py::TestMean::test_mean_values, test/torch_np/test_reductions.py::TestMean::test_mean_where, test/torch_np/test_reductions.py::TestSum::test_sum, test/torch_np/test_reductions.py::TestSum::test_sum_boolean, test/torch_np/test_reductions.py::TestSum::test_sum_complex_1_dt0, test/torch_np/test_reductions.py::TestSum::test_sum_complex_1_dt1, test/torch_np/test_reductions.py::TestSum::test_sum_complex_2_dt0, test/torch_np/test_reductions.py::TestSum::test_sum_complex_2_dt1, test/torch_np/test_reductions.py::TestSum::test_sum_dtypes_2, test/torch_np/test_reductions.py::TestSum::test_sum_dtypes_warnings, test/torch_np/test_reductions.py::TestSum::test_sum_initial, test/torch_np/test_reductions.py::TestSum::test_sum_stability, 
test/torch_np/test_reductions.py::TestSum::test_sum_where, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_array_axis_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func8, 
test/torch_np/test_reductions.py::TestGenericReductions::test_axis_bad_tuple_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_axis_empty_generic_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_bad_axis_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis5_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func5, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis6_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis7_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis8_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-1_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_-2_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func8, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_0_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_1_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func5, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_2_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func3, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_generic_axis_none_func9, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func11_axis_2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis8, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis_1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_keepdims_out_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype0_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis8, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis6, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis5, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_float64_func9_axis_2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis_1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis_0, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis8, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_False_dtype_int32_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype0_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis7, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis6, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis5, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func4_axis_2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis_1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis_0, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_float64_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis_-2, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func0_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func10_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func11_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func1_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func2_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func3_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func4_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func5_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func6_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func7_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis_-1, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func8_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis_-1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis_-2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis_0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis_1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_axis_keepdims_True_dtype_int32_func9_axis_2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func0, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func1, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func10, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func11, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func2, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func3, 
test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func4, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func5, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func6, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func7, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func8, test/torch_np/test_reductions.py::TestGenericReductions::test_out_scalar_func9, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_array_axis_func0, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_array_axis_func1, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_axis_bad_tuple_func0, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_axis_bad_tuple_func1, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_axis_empty_generic_func0, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_axis_empty_generic_func1, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_bad_axis_func0, test/torch_np/test_reductions.py::TestGenericCumSumProd::test_bad_axis_func1 2025-12-04T13:46:37.7362849Z 2025-12-04T13:46:37.7362971Z Finished torch_np/test_reductions 1/1 ... [2025-12-04 13:46:37.710632][3582506.235442619], took 0.06min 2025-12-04T13:46:37.7363364Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:46:37.7363725Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:46:37.7363977Z Running torch_np/numpy_tests/core/test_scalar_ctors 1/1 ... 
[2025-12-04 13:46:37.717556][3582506.242370538] 2025-12-04T13:46:37.7364196Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:46:37.7364614Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/core/test_scalar_ctors.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:46:37.717746] 2025-12-04T13:46:39.9360075Z 2025-12-04T13:46:39.9360754Z torch_np/numpy_tests/core/test_scalar_ctors 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.core.test_scalar_ctors_1.1_cd212594be4dfe31_.log 2025-12-04T13:46:39.9377582Z Running 65 items in this shard: test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestFromString::test_bool, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestFromString::test_floating, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestFromString::test_floating_overflow, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestFromInt::test_intp, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestFromInt::test_uint64_from_negative, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t10_t20, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t10_t21, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t10_t22, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t11_t20, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t11_t21, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_complex_t11_t22, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_np_int_, 
test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_np_longlong, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_byte_t26, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__np_int_, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__np_longlong, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_int__t26, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_np_int_, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_np_longlong, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_intc_t26, 
test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_np_int_, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_np_longlong, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_longlong_t26, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_np_int_, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_np_longlong, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_np_short_t26, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_np_byte, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_np_int_, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_np_intc, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_np_longlong, 
test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_np_short, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_t25, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_integers_t15_t26, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t10_t20, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t10_t21, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t10_t22, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t10_t23, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t11_t20, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t11_t21, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t11_t22, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t11_t23, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t12_t20, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t12_t21, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t12_t22, test/torch_np/numpy_tests/core/test_scalar_ctors.py::TestArrayFromScalar::test_reals_t12_t23 2025-12-04T13:46:39.9388922Z 2025-12-04T13:46:39.9389072Z Finished torch_np/numpy_tests/core/test_scalar_ctors 1/1 ... [2025-12-04 13:46:39.935702][3582508.460511022], took 0.04min 2025-12-04T13:46:39.9389540Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:46:39.9434205Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:46:39.9436340Z Running torch_np/numpy_tests/lib/test_arraypad 1/1 ... 
[2025-12-04 13:46:39.943429][3582508.468242366] 2025-12-04T13:46:39.9436547Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:46:39.9437964Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'torch_np/numpy_tests/lib/test_arraypad.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:46:39.943615] 2025-12-04T13:46:42.2119311Z 2025-12-04T13:46:42.2120189Z torch_np/numpy_tests/lib/test_arraypad 1/1 was successful, full logs can be found in artifacts with path test/test-reports/torch_np.numpy_tests.lib.test_arraypad_1.1_a9d0ec2e2f65f0ef_.log 2025-12-04T13:46:42.2122788Z Running 9 items in this shard: test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_float, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_float2, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_float3, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_odd_pad_amount, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_pad_2d, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_constant_zeros, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_check_large_integers, test/torch_np/numpy_tests/lib/test_arraypad.py::TestConstant::test_pad_empty_dimension 2025-12-04T13:46:42.2124734Z 2025-12-04T13:46:42.2124966Z Finished torch_np/numpy_tests/lib/test_arraypad 1/1 ... 
[2025-12-04 13:46:42.211609][3582510.736418485], took 0.04min 2025-12-04T13:46:42.2138934Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:46:42.2193458Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:46:42.2193742Z Running test_prims 1/1 ... [2025-12-04 13:46:42.219248][3582510.744061641] 2025-12-04T13:46:42.2193970Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:46:42.2196080Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_prims.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:46:42.219434] 2025-12-04T13:46:45.7898724Z 2025-12-04T13:46:45.7899682Z test_prims 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_prims_1.1_da93f1b2696a8ac0_.log 2025-12-04T13:46:45.7905272Z Running 26 items in this shard: test/test_prims.py::TestPrimsBasic::test_check_deprecation_warning, test/test_prims.py::TestPrimsBasic::test_clone_complex, test/test_prims.py::TestPrimsBasic::test_clone_meta_stride_preservation_dense, test/test_prims.py::TestPrimsBasic::test_clone_meta_stride_preservation_sparse, test/test_prims.py::TestPrimsBasic::test_mul_complex, test/test_prims.py::TestPrimsBasic::test_torch_ops, test/test_prims.py::TestPrimsCUDA::test_aten_overload_to_prims_cuda, test/test_prims.py::TestPrimsCUDA::test_broadcast_in_dim_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_broadcast_in_dim_sum_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_cbrt_prim_cuda_float64, test/test_prims.py::TestPrimsCUDA::test_cbrt_prim_cuda_int64, test/test_prims.py::TestPrimsCUDA::test_collapse_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_functional_rng_wrappers_cuda_float32, 
test/test_prims.py::TestPrimsCUDA::test_memory_format_strides_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_philox_rand_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_reshape_view_method_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_var_correction_0_cuda_float32, test/test_prims.py::TestPrimsCUDA::test_var_correction_1_cuda_float32, test/test_prims.py::TestRefsCUDA::test_constant_pad_nd_memory_format_cuda_float32, test/test_prims.py::TestRefsCUDA::test_inferred_tags_cuda, test/test_prims.py::TestRefsCUDA::test_infinite_loop_from_py_dispatcher_cuda, test/test_prims.py::TestRefsCUDA::test_linspace_with_complex_input_cuda, test/test_prims.py::TestRefsCUDA::test_logspace_with_complex_input_cuda, test/test_prims.py::TestRefsCUDA::test_unbind_cuda, test/test_prims.py::TestDecompCUDA::test_decomposition_method_vararg_ones_cuda_float32, test/test_prims.py::TestDecompCUDA::test_decomposition_method_vararg_permute_cuda_float32 2025-12-04T13:46:45.7909758Z 2025-12-04T13:46:45.7909915Z Finished test_prims 1/1 ... [2025-12-04 13:46:45.789584][3582514.314393557], took 0.06min 2025-12-04T13:46:45.7919234Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:46:45.7971976Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:46:45.7974258Z Running test_spectral_ops 1/1 ... [2025-12-04 13:46:45.797250][3582514.322063713] 2025-12-04T13:46:45.7974498Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:46:45.7975286Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_spectral_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=2', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:46:45.797444] 2025-12-04T13:48:10.7776694Z 2025-12-04T13:48:10.7777744Z test_spectral_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/test_spectral_ops_1.1_ec30cbac1a8735db_.log 2025-12-04T13:48:10.7818508Z Running 347 items in this shard: test/test_spectral_ops.py::TestFFTCUDA::test_batch_istft_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_complex_istft_real_equiv_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_definition_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_onesided_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_real_equiv_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_roundtrip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_complex_stft_roundtrip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_context_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_context_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_cufft_plan_cache_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_complex64, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft2_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft__refs_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft2_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft2_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_empty_fft_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_empty_ifft_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_fftn_equivalence_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_fftn_equivalence_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_invalid_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_numpy_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft2_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_fftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfft2_cuda_bfloat16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_hfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ifftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_ihfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_irfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors__refs_fft_rfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_fftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfft_cuda_bfloat16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_hfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ifftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_ihfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_irfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfft2_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfft_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_bfloat16_errors_fft_rfftn_cuda_bfloat16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_fftn_cuda_float16, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft2_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_fftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifftn_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ifftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft2_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfftn_cuda_complex32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_irfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfft2_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfft_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft_rfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_ifft_rfft_irfft_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_input_modification_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_invalid_dtypes_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_plan_repeatable_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_round_trip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fft_type_promotion_cuda_int8, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_numpy_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_out_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_fftfreq_out_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid__refs_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_fftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ifftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_irfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_invalid_fft_rfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_complex64, 
test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_noop_transform_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftn_round_trip_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_frequencies_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_frequencies_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_fftshift_numpy_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_hfftn_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float16, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_ihfftn_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_against_librosa_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_linearity_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_of_sine_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_requires_window_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_simple_cases_cuda_float64, 
test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_various_params_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_round_trip_with_padding_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_istft_throws_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d__refs_fft_rfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_fft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_fft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_hfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_hfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ifft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ifft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_ihfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_irfft_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_irfft_cuda_float32, test/test_spectral_ops.py::TestFFTCUDA::test_reference_1d_fft_rfft_cuda_float32, 
test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_fftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_hfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_ifftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_irfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd__refs_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_fftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_fftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_hfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_hfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_ifftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_ifftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_irfftn_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_reference_nd_fft_irfftn_cuda_complex64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_align_to_window_only_requires_non_center_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_requires_complex_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_requires_window_cuda, test/test_spectral_ops.py::TestFFTCUDA::test_stft_roundtrip_complex_window_cuda_complex128, test/test_spectral_ops.py::TestFFTCUDA::test_stft_roundtrip_complex_window_cuda_float64, test/test_spectral_ops.py::TestFFTCUDA::test_stft_window_device_cuda 2025-12-04T13:48:10.7855508Z 
2025-12-04T13:48:10.7855618Z Finished test_spectral_ops 1/1 ... [2025-12-04 13:48:10.777563][3582599.30237333], took 1.42min 2025-12-04T13:48:10.7856012Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:48:10.7856371Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:48:10.7856581Z Running doctests 1/1 ... [2025-12-04 13:48:10.785242][3582599.310055696] 2025-12-04T13:48:11.1969282Z msg = Cannot scrape callname=Library.fallback in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=368. 2025-12-04T13:48:11.1969737Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.1970061Z Registers the function implementation as the fallback for the given key. 2025-12-04T13:48:11.1970233Z 2025-12-04T13:48:11.1970354Z This function only works for a library with global namespace ("_"). 2025-12-04T13:48:11.1970511Z 2025-12-04T13:48:11.1970566Z Args: 2025-12-04T13:48:11.1970777Z fn: function used as fallback for the given dispatch key or :func:`~fallthrough_kernel` 2025-12-04T13:48:11.1971033Z to register a fallthrough. 2025-12-04T13:48:11.1971293Z dispatch_key: dispatch key that the input function should be registered for. By default, it uses 2025-12-04T13:48:11.1971594Z the dispatch key that the library was created with. 2025-12-04T13:48:11.1972050Z with_keyset: flag controlling if the current dispatcher call keyset should be passed as the first argument 2025-12-04T13:48:11.1972423Z to :attr:`fn` when calling. This should be used to create the appropriate keyset for redispatch calls. 
2025-12-04T13:48:11.1972623Z 2025-12-04T13:48:11.1972717Z Example:: 2025-12-04T13:48:11.1972793Z 2025-12-04T13:48:11.1972865Z >>> my_lib = Library("_", "IMPL") 2025-12-04T13:48:11.1973052Z >>> def fallback_kernel(op, *args, **kwargs): 2025-12-04T13:48:11.1973245Z >>> # Handle all autocast ops generically 2025-12-04T13:48:11.1973422Z >>> # ... 2025-12-04T13:48:11.1973584Z >>> my_lib.fallback(fallback_kernel, "Autocast") 2025-12-04T13:48:11.1973753Z 2025-12-04T13:48:11.1974103Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 5, 1, 'my_lib.fallback(fallback_kernel, "Autocast")\n', 5, 7)) 2025-12-04T13:48:11.1974416Z 2025-12-04T13:48:11.1974492Z my_lib.fallback(fallback_kernel, "Autocast") 2025-12-04T13:48:11.1974658Z ^ 2025-12-04T13:48:11.2028349Z msg = Cannot scrape callname=register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=958. 2025-12-04T13:48:11.2028905Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.2029152Z Register a FakeTensor implementation ("fake impl") for this operator. 2025-12-04T13:48:11.2029293Z 2025-12-04T13:48:11.2029381Z Also sometimes known as a "meta kernel", "abstract impl". 2025-12-04T13:48:11.2029502Z 2025-12-04T13:48:11.2029610Z An "FakeTensor implementation" specifies the behavior of this operator on 2025-12-04T13:48:11.2029851Z Tensors that carry no data ("FakeTensor"). Given some input Tensors with 2025-12-04T13:48:11.2030139Z certain properties (sizes/strides/storage_offset/device), it specifies 2025-12-04T13:48:11.2030398Z what the properties of the output Tensors are. 2025-12-04T13:48:11.2030508Z 2025-12-04T13:48:11.2030609Z The FakeTensor implementation has the same signature as the operator. 2025-12-04T13:48:11.2030935Z It is run for both FakeTensors and meta tensors. 
To write a FakeTensor 2025-12-04T13:48:11.2031230Z implementation, assume that all Tensor inputs to the operator are 2025-12-04T13:48:11.2031458Z regular CPU/CUDA/Meta tensors, but they do not have storage, and 2025-12-04T13:48:11.2031678Z you are trying to return regular CPU/CUDA/Meta tensor(s) as output. 2025-12-04T13:48:11.2031957Z The FakeTensor implementation must consist of only PyTorch operations 2025-12-04T13:48:11.2032182Z (and may not directly access the storage or data of any input or 2025-12-04T13:48:11.2032361Z intermediate Tensors). 2025-12-04T13:48:11.2032446Z 2025-12-04T13:48:11.2032522Z This API may be used as a decorator (see examples). 2025-12-04T13:48:11.2032639Z 2025-12-04T13:48:11.2032849Z For a detailed guide on custom ops, please see 2025-12-04T13:48:11.2033090Z https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-12-04T13:48:11.2033237Z 2025-12-04T13:48:11.2033283Z Args: 2025-12-04T13:48:11.2033440Z op_name: Operator name (along with the overload) or OpOverload object. 2025-12-04T13:48:11.2033642Z func: Fake tensor implementation. 2025-12-04T13:48:11.2033824Z lib (Optional[Library]): Library to register the fake tensor to. 2025-12-04T13:48:11.2034033Z allow_override: Flag controlling if we want to override an 2025-12-04T13:48:11.2034233Z existing registered fake impl. This is by default off, 2025-12-04T13:48:11.2034435Z and will error you're trying to register a fake impl to 2025-12-04T13:48:11.2034637Z an operator that already has a fake impl. This also only 2025-12-04T13:48:11.2034832Z applies if the custom operator was not created via 2025-12-04T13:48:11.2035032Z torch.library.custom_op, as overriding and existing fake 2025-12-04T13:48:11.2035211Z impl is already allowed. 
2025-12-04T13:48:11.2035308Z 2025-12-04T13:48:11.2035370Z Examples: 2025-12-04T13:48:11.2035484Z >>> import torch 2025-12-04T13:48:11.2035617Z >>> import numpy as np 2025-12-04T13:48:11.2035759Z >>> from torch import Tensor 2025-12-04T13:48:11.2035896Z >>> 2025-12-04T13:48:11.2036042Z >>> # Example 1: an operator without data-dependent output shape 2025-12-04T13:48:11.2036259Z >>> @torch.library.custom_op("mylib::custom_linear", mutates_args=()) 2025-12-04T13:48:11.2036492Z >>> def custom_linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-12-04T13:48:11.2036721Z >>> raise NotImplementedError("Implementation goes here") 2025-12-04T13:48:11.2036888Z >>> 2025-12-04T13:48:11.2037031Z >>> @torch.library.register_fake("mylib::custom_linear") 2025-12-04T13:48:11.2037201Z >>> def _(x, weight, bias): 2025-12-04T13:48:11.2037345Z >>> assert x.dim() == 2 2025-12-04T13:48:11.2037493Z >>> assert weight.dim() == 2 2025-12-04T13:48:11.2037637Z >>> assert bias.dim() == 1 2025-12-04T13:48:11.2037821Z >>> assert x.shape[1] == weight.shape[1] 2025-12-04T13:48:11.2037993Z >>> assert weight.shape[0] == bias.shape[0] 2025-12-04T13:48:11.2038153Z >>> assert x.device == weight.device 2025-12-04T13:48:11.2038286Z >>> 2025-12-04T13:48:11.2038417Z >>> return (x @ weight.t()) + bias 2025-12-04T13:48:11.2038550Z >>> 2025-12-04T13:48:11.2038680Z >>> with torch._subclasses.fake_tensor.FakeTensorMode(): 2025-12-04T13:48:11.2038847Z >>> x = torch.randn(2, 3) 2025-12-04T13:48:11.2039002Z >>> w = torch.randn(3, 3) 2025-12-04T13:48:11.2039135Z >>> b = torch.randn(3) 2025-12-04T13:48:11.2039300Z >>> y = torch.ops.mylib.custom_linear(x, w, b) 2025-12-04T13:48:11.2039445Z >>> 2025-12-04T13:48:11.2039551Z >>> assert y.shape == (2, 3) 2025-12-04T13:48:11.2039676Z >>> 2025-12-04T13:48:11.2039808Z >>> # Example 2: an operator with data-dependent output shape 2025-12-04T13:48:11.2040041Z >>> @torch.library.custom_op("mylib::custom_nonzero", mutates_args=()) 2025-12-04T13:48:11.2040230Z >>> def 
custom_nonzero(x: Tensor) -> Tensor: 2025-12-04T13:48:11.2040366Z >>> x_np = x.numpy(force=True) 2025-12-04T13:48:11.2040505Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-12-04T13:48:11.2040649Z >>> return torch.tensor(res, device=x.device) 2025-12-04T13:48:11.2040773Z >>> 2025-12-04T13:48:11.2040894Z >>> @torch.library.register_fake("mylib::custom_nonzero") 2025-12-04T13:48:11.2041035Z >>> def _(x): 2025-12-04T13:48:11.2041160Z >>> # Number of nonzero-elements is data-dependent. 2025-12-04T13:48:11.2041319Z >>> # Since we cannot peek at the data in an fake impl, 2025-12-04T13:48:11.2041479Z >>> # we use the ctx object to construct a new symint that 2025-12-04T13:48:11.2041631Z >>> # represents the data-dependent size. 2025-12-04T13:48:11.2041769Z >>> ctx = torch.library.get_ctx() 2025-12-04T13:48:11.2041949Z >>> nnz = ctx.new_dynamic_size() 2025-12-04T13:48:11.2042080Z >>> shape = [nnz, x.dim()] 2025-12-04T13:48:11.2042222Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-12-04T13:48:11.2042364Z >>> return result 2025-12-04T13:48:11.2042473Z >>> 2025-12-04T13:48:11.2042594Z >>> from torch.fx.experimental.proxy_tensor import make_fx 2025-12-04T13:48:11.2042735Z >>> 2025-12-04T13:48:11.2042831Z >>> x = torch.tensor([0, 1, 2, 3, 4, 0]) 2025-12-04T13:48:11.2043007Z >>> trace = make_fx(torch.ops.mylib.custom_nonzero, tracing_mode="symbolic")(x) 2025-12-04T13:48:11.2043186Z >>> trace.print_readable() 2025-12-04T13:48:11.2043298Z >>> 2025-12-04T13:48:11.2043430Z >>> assert torch.allclose(trace(x), torch.ops.mylib.custom_nonzero(x)) 2025-12-04T13:48:11.2043550Z 2025-12-04T13:48:11.2043588Z 2025-12-04T13:48:11.2043816Z Original Error: IndentationError('expected an indented block after function definition on line 37', ('', 38, 1, '_._ = None\n', 38, 2)) 2025-12-04T13:48:11.2044028Z 2025-12-04T13:48:11.2044068Z _._ = None 2025-12-04T13:48:11.2044151Z ^ 2025-12-04T13:48:11.2088971Z msg = Cannot scrape callname=get_kernel in 
modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=1530. 2025-12-04T13:48:11.2089264Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.2089471Z Returns the computed kernel for a given operator and dispatch key. 2025-12-04T13:48:11.2089585Z 2025-12-04T13:48:11.2089678Z This function retrieves the kernel that would be executed for a given 2025-12-04T13:48:11.2089886Z operator and dispatch key combination. The returned SafeKernelFunction 2025-12-04T13:48:11.2090084Z can be used to call the kernel in a boxed fashion. The intended use 2025-12-04T13:48:11.2090271Z case for this function is to retrieve the original kernel for a given 2025-12-04T13:48:11.2090493Z dispatch key and then register another kernel to the same dispatch key 2025-12-04T13:48:11.2090676Z that calls into the original kernel for certain cases. 2025-12-04T13:48:11.2090780Z 2025-12-04T13:48:11.2090816Z Args: 2025-12-04T13:48:11.2090943Z op: Operator name (along with the overload) or OpOverload object 2025-12-04T13:48:11.2091140Z Can be a string (e.g., "aten::add.Tensor"), an OpOverload, or a CustomOpDef. 2025-12-04T13:48:11.2091361Z dispatch_key (str | torch.DispatchKey): The dispatch key to get the kernel for. 2025-12-04T13:48:11.2091579Z Can be a string (e.g., "CPU", "CUDA") or a DispatchKey enum value. 2025-12-04T13:48:11.2091704Z 2025-12-04T13:48:11.2091742Z Returns: 2025-12-04T13:48:11.2091926Z torch._C._SafeKernelFunction: A safe kernel function that can be used to 2025-12-04T13:48:11.2092097Z call the kernel. 2025-12-04T13:48:11.2092169Z 2025-12-04T13:48:11.2092205Z Raises: 2025-12-04T13:48:11.2092315Z RuntimeError: If the operator does not exist. 
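Annotation: the use case described in this docstring (fetch the original kernel, then register a replacement under the same key that defers to the original in some cases) can be sketched in plain Python. The `kernels` dictionary below is a hypothetical stand-in for a dispatch table and has nothing to do with how PyTorch stores kernels; only the wrap-and-delegate pattern is illustrated.

```python
import math
from typing import Callable, Dict, List

# Hypothetical registry standing in for a dispatch table; not a PyTorch structure.
Kernel = Callable[[List[float]], List[float]]
kernels: Dict[str, Kernel] = {"sin": lambda xs: [math.sin(v) for v in xs]}

# Step 1: keep a handle to the original kernel before replacing it.
original_sin = kernels["sin"]


def conditional_sin(xs: List[float]) -> List[float]:
    # Step 2: call the original only when some input is negative,
    # mirroring the condition used in the docstring example.
    if any(v < 0 for v in xs):
        return original_sin(xs)
    return [0.0 for _ in xs]


# Step 3: register the wrapper under the same key.
kernels["sin"] = conditional_sin

positive = kernels["sin"]([1.0, 2.0])
mixed = kernels["sin"]([-1.0, 2.0])
print(positive, mixed)
```

Capturing `original_sin` before the reassignment is what keeps the wrapper from calling itself.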
2025-12-04T13:48:11.2092425Z 2025-12-04T13:48:11.2092463Z Example: 2025-12-04T13:48:11.2092570Z >>> # Get the CPU kernel for torch.add 2025-12-04T13:48:11.2092730Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", "CPU") 2025-12-04T13:48:11.2092874Z >>> 2025-12-04T13:48:11.2092974Z >>> # You can also use DispatchKey enum 2025-12-04T13:48:11.2093157Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", torch.DispatchKey.CPU) 2025-12-04T13:48:11.2093327Z >>> 2025-12-04T13:48:11.2093422Z >>> # Or use an OpOverload directly 2025-12-04T13:48:11.2093592Z >>> kernel = torch.library.get_kernel(torch.ops.aten.add.Tensor, "CPU") 2025-12-04T13:48:11.2093748Z >>> 2025-12-04T13:48:11.2093880Z >>> # Example: Using get_kernel in a custom op with conditional dispatch 2025-12-04T13:48:11.2094048Z >>> # Get the original kernel for torch.sin 2025-12-04T13:48:11.2094220Z >>> original_sin_kernel = torch.library.get_kernel("aten::sin", "CPU") 2025-12-04T13:48:11.2094371Z >>> 2025-12-04T13:48:11.2094506Z >>> # If input has negative values, use original sin, otherwise return zeros 2025-12-04T13:48:11.2094683Z >>> def conditional_sin_impl(dispatch_keys, x): 2025-12-04T13:48:11.2094832Z >>> if (x < 0).any(): 2025-12-04T13:48:11.2094976Z >>> return original_sin_kernel.call_boxed(dispatch_keys, x) 2025-12-04T13:48:11.2095124Z >>> else: 2025-12-04T13:48:11.2095233Z >>> return torch.zeros_like(x) 2025-12-04T13:48:11.2095349Z >>> 2025-12-04T13:48:11.2095456Z >>> lib = torch.library.Library("aten", "IMPL") 2025-12-04T13:48:11.2095643Z >>> # with_keyset=True so the first argument to the impl is the current DispatchKeySet 2025-12-04T13:48:11.2095846Z >>> which needs to be the first argument to ``kernel.call_boxed`` 2025-12-04T13:48:11.2096038Z >>> lib.impl("sin", conditional_sin_impl, "CPU", with_keyset=True) 2025-12-04T13:48:11.2096186Z >>> 2025-12-04T13:48:11.2096283Z >>> # Test the conditional behavior 2025-12-04T13:48:11.2096418Z >>> x_positive = torch.tensor([1.0, 2.0]) 
2025-12-04T13:48:11.2096553Z >>> x_mixed = torch.tensor([-1.0, 2.0]) 2025-12-04T13:48:11.2096682Z >>> torch.sin(x_positive) 2025-12-04T13:48:11.2096799Z tensor([0., 0.]) 2025-12-04T13:48:11.2096909Z >>> torch.sin(x_mixed) 2025-12-04T13:48:11.2097024Z tensor([-0.8415, 0.9093]) 2025-12-04T13:48:11.2097134Z 2025-12-04T13:48:11.2097350Z Original Error: SyntaxError('invalid syntax', ('', 23, 7, 'which needs to be the first argument to ``kernel.call_boxed``\n', 23, 12)) 2025-12-04T13:48:11.2097552Z 2025-12-04T13:48:11.2097621Z which needs to be the first argument to ``kernel.call_boxed`` 2025-12-04T13:48:11.2097763Z ^ 2025-12-04T13:48:11.4856669Z msg = Cannot scrape callname=is_available in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=70. 2025-12-04T13:48:11.4857948Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.4858560Z Check if the current accelerator is available at runtime: it was build, all the 2025-12-04T13:48:11.4859133Z required drivers are available and at least one device is visible. 2025-12-04T13:48:11.4859621Z See :ref:`accelerator` for details. 2025-12-04T13:48:11.4859887Z 2025-12-04T13:48:11.4860075Z Returns: 2025-12-04T13:48:11.4860502Z bool: A boolean indicating if there is an available :ref:`accelerator`. 2025-12-04T13:48:11.4860879Z 2025-12-04T13:48:11.4861318Z .. note:: This API delegates to the device-specific version of `is_available`. 2025-12-04T13:48:11.4862090Z On CUDA, when the environment variable ``PYTORCH_NVML_BASED_CUDA_CHECK=1`` is set, 2025-12-04T13:48:11.4862516Z this function will NOT poison fork. Otherwise, it will. For more details, see 2025-12-04T13:48:11.4862963Z :ref:`multiprocessing-poison-fork-note`. 2025-12-04T13:48:11.4863162Z 2025-12-04T13:48:11.4863239Z Example:: 2025-12-04T13:48:11.4863346Z 2025-12-04T13:48:11.4863539Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 
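Annotation: the doctest line just above is not valid Python, because an `assert` condition and its message must be separated by a comma. The check below only parses the two variants with the standard library and never imports torch. The `fixed` string is an assumed correction, not something taken from the log.

```python
import ast

# The doctest line as it appears in the log: no comma before the message.
logged = 'assert torch.accelerator.is_available() "No available accelerators detected."'
# Assumed fix (not from the log): separate condition and message with a comma.
fixed = 'assert torch.accelerator.is_available(), "No available accelerators detected."'


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


logged_ok = parses(logged)
fixed_ok = parses(fixed)
print(logged_ok, fixed_ok)
```

The string literal in the logged line starts at column 41, which matches the offset given in the SyntaxError for this record.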
2025-12-04T13:48:11.4863888Z 2025-12-04T13:48:11.4864348Z Original Error: SyntaxError('invalid syntax', ('', 1, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 1, 78)) 2025-12-04T13:48:11.4864785Z 2025-12-04T13:48:11.4864964Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-12-04T13:48:11.4865295Z ^ 2025-12-04T13:48:11.4868485Z msg = Cannot scrape callname=synchronize in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=239. 2025-12-04T13:48:11.4869100Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.4869498Z Wait for all kernels in all streams on the given device to complete. 2025-12-04T13:48:11.4869714Z 2025-12-04T13:48:11.4869786Z Args: 2025-12-04T13:48:11.4870095Z device (:class:`torch.device`, str, int, optional): device for which to synchronize. It must match 2025-12-04T13:48:11.4870481Z the current :ref:`accelerator` device type. If not given, 2025-12-04T13:48:11.4870764Z use :func:`torch.accelerator.current_device_index` by default. 2025-12-04T13:48:11.4870930Z 2025-12-04T13:48:11.4871093Z .. note:: This function is a no-op if the current :ref:`accelerator` is not initialized. 2025-12-04T13:48:11.4871303Z 2025-12-04T13:48:11.4871361Z Example:: 2025-12-04T13:48:11.4871447Z 2025-12-04T13:48:11.4871533Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA) 2025-12-04T13:48:11.4871807Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 
2025-12-04T13:48:11.4872130Z >>> start_event = torch.Event(enable_timing=True) 2025-12-04T13:48:11.4872348Z >>> end_event = torch.Event(enable_timing=True) 2025-12-04T13:48:11.4872548Z >>> start_event.record() 2025-12-04T13:48:11.4872786Z >>> tensor = torch.randn(100, device=torch.accelerator.current_accelerator()) 2025-12-04T13:48:11.4873035Z >>> sum = torch.sum(tensor) 2025-12-04T13:48:11.4873206Z >>> end_event.record() 2025-12-04T13:48:11.4873391Z >>> torch.accelerator.synchronize() 2025-12-04T13:48:11.4873611Z >>> elapsed_time_ms = start_event.elapsed_time(end_event) 2025-12-04T13:48:11.4873813Z 2025-12-04T13:48:11.4874159Z Original Error: SyntaxError('invalid syntax', ('', 2, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 2, 78)) 2025-12-04T13:48:11.4874479Z 2025-12-04T13:48:11.4874615Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-12-04T13:48:11.4874912Z ^ 2025-12-04T13:48:11.4998847Z msg = Cannot scrape callname=cudart in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py line=448. 2025-12-04T13:48:11.4999248Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:11.4999477Z Retrieves the CUDA runtime API module. 2025-12-04T13:48:11.4999592Z 2025-12-04T13:48:11.4999596Z 2025-12-04T13:48:11.4999712Z This function initializes the CUDA runtime environment if it is not already 2025-12-04T13:48:11.5000030Z initialized and returns the CUDA runtime API module (_cudart). The CUDA 2025-12-04T13:48:11.5000318Z runtime API module provides access to various CUDA runtime functions. 2025-12-04T13:48:11.5000477Z 2025-12-04T13:48:11.5000524Z Args: 2025-12-04T13:48:11.5000637Z ``None`` 2025-12-04T13:48:11.5000714Z 2025-12-04T13:48:11.5000757Z Returns: 2025-12-04T13:48:11.5000903Z module: The CUDA runtime API module (_cudart). 
2025-12-04T13:48:11.5001033Z 2025-12-04T13:48:11.5001100Z Raises: 2025-12-04T13:48:11.5001277Z RuntimeError: If CUDA cannot be re-initialized in a forked subprocess. 2025-12-04T13:48:11.5001599Z AssertionError: If PyTorch is not compiled with CUDA support or if libcudart functions are unavailable. 2025-12-04T13:48:11.5001810Z 2025-12-04T13:48:11.5001931Z Example of CUDA operations with profiling: 2025-12-04T13:48:11.5002106Z >>> import torch 2025-12-04T13:48:11.5002265Z >>> from torch.cuda import cudart, check_error 2025-12-04T13:48:11.5002433Z >>> import os 2025-12-04T13:48:11.5011237Z >>> 2025-12-04T13:48:11.5011370Z >>> os.environ["CUDA_PROFILE"] = "1" 2025-12-04T13:48:11.5011503Z >>> 2025-12-04T13:48:11.5011628Z >>> def perform_cuda_operations_with_streams(): 2025-12-04T13:48:11.5011785Z >>> stream = torch.cuda.Stream() 2025-12-04T13:48:11.5011982Z >>> with torch.cuda.stream(stream): 2025-12-04T13:48:11.5012141Z >>> x = torch.randn(100, 100, device='cuda') 2025-12-04T13:48:11.5012298Z >>> y = torch.randn(100, 100, device='cuda') 2025-12-04T13:48:11.5012440Z >>> z = torch.mul(x, y) 2025-12-04T13:48:11.5012575Z >>> return z 2025-12-04T13:48:11.5012685Z >>> 2025-12-04T13:48:11.5012787Z >>> torch.cuda.synchronize() 2025-12-04T13:48:11.5012936Z >>> print("====== Start nsys profiling ======") 2025-12-04T13:48:11.5013098Z >>> check_error(cudart().cudaProfilerStart()) 2025-12-04T13:48:11.5013263Z >>> with torch.autograd.profiler.emit_nvtx(): 2025-12-04T13:48:11.5013435Z >>> result = perform_cuda_operations_with_streams() 2025-12-04T13:48:11.5013594Z >>> print("CUDA operations completed.") 2025-12-04T13:48:11.5013760Z >>> check_error(torch.cuda.cudart().cudaProfilerStop()) 2025-12-04T13:48:11.5013926Z >>> print("====== End nsys profiling ======") 2025-12-04T13:48:11.5014024Z 2025-12-04T13:48:11.5014109Z To run this example and save the profiling information, execute: 2025-12-04T13:48:11.5014383Z >>> $ nvprof --profile-from-start off --csv --print-summary -o 
trace_name.prof -f -- python cudart_test.py 2025-12-04T13:48:11.5014564Z 2025-12-04T13:48:11.5014662Z This command profiles the CUDA operations in the provided script and saves 2025-12-04T13:48:11.5014880Z the profiling information to a file named `trace_name.prof`. 2025-12-04T13:48:11.5015112Z The `--profile-from-start off` option ensures that profiling starts only 2025-12-04T13:48:11.5015305Z after the `cudaProfilerStart` call in the script. 2025-12-04T13:48:11.5015496Z The `--csv` and `--print-summary` options format the profiling output as a 2025-12-04T13:48:11.5015684Z CSV file and print a summary, respectively. 2025-12-04T13:48:11.5015874Z The `-o` option specifies the output file name, and the `-f` option forces the 2025-12-04T13:48:11.5016078Z overwrite of the output file if it already exists. 2025-12-04T13:48:11.5016270Z 2025-12-04T13:48:11.5016566Z Original Error: SyntaxError('invalid syntax', ('', 1, 1, '$ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py\n', 1, 2)) 2025-12-04T13:48:11.5016843Z 2025-12-04T13:48:11.5016977Z $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-12-04T13:48:11.5017188Z ^ 2025-12-04T13:48:12.9344314Z msg = Cannot scrape callname=vmap in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=39. 2025-12-04T13:48:12.9345578Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:12.9345912Z 2025-12-04T13:48:12.9346123Z vmap is the vectorizing map; ``vmap(func)`` returns a new function that 2025-12-04T13:48:12.9346613Z maps ``func`` over some dimension of the inputs. Semantically, vmap 2025-12-04T13:48:12.9347086Z pushes the map into PyTorch operations called by ``func``, effectively 2025-12-04T13:48:12.9347638Z vectorizing those operations. 
2025-12-04T13:48:12.9347831Z 2025-12-04T13:48:12.9348028Z vmap is useful for handling batch dimensions: one can write a function 2025-12-04T13:48:12.9348498Z ``func`` that runs on examples and then lift it to a function that can 2025-12-04T13:48:12.9348967Z take batches of examples with ``vmap(func)``. vmap can also be used to 2025-12-04T13:48:12.9349409Z compute batched gradients when composed with autograd. 2025-12-04T13:48:12.9349666Z 2025-12-04T13:48:12.9349802Z .. note:: 2025-12-04T13:48:12.9350110Z :func:`torch.vmap` is aliased to :func:`torch.func.vmap` for 2025-12-04T13:48:12.9350510Z convenience. Use whichever one you'd like. 2025-12-04T13:48:12.9350724Z 2025-12-04T13:48:12.9350813Z Args: 2025-12-04T13:48:12.9351127Z func (function): A Python function that takes one or more arguments. 2025-12-04T13:48:12.9351537Z Must return one or more Tensors. 2025-12-04T13:48:12.9352050Z in_dims (int or nested structure): Specifies which dimension of the 2025-12-04T13:48:12.9352426Z inputs should be mapped over. ``in_dims`` should have a 2025-12-04T13:48:12.9352795Z structure like the inputs. If the ``in_dim`` for a particular 2025-12-04T13:48:12.9353165Z input is None, then that indicates there is no map dimension. 2025-12-04T13:48:12.9353454Z Default: 0. 2025-12-04T13:48:12.9353731Z out_dims (int or Tuple[int]): Specifies where the mapped dimension 2025-12-04T13:48:12.9354107Z should appear in the outputs. If ``out_dims`` is a Tuple, then 2025-12-04T13:48:12.9354459Z it should have one element per output. Default: 0. 2025-12-04T13:48:12.9354810Z randomness (str): Specifies whether the randomness in this 2025-12-04T13:48:12.9355188Z vmap should be the same or different across batches. If 'different', 2025-12-04T13:48:12.9355587Z the randomness for each batch will be different. If 'same', the 2025-12-04T13:48:12.9355982Z randomness will be the same across batches. If 'error', any calls to 2025-12-04T13:48:12.9356387Z random functions will error. Default: 'error'. 
WARNING: this flag 2025-12-04T13:48:12.9356787Z only applies to random PyTorch operations and does not apply to 2025-12-04T13:48:12.9357186Z Python's random module or numpy randomness. 2025-12-04T13:48:12.9357565Z chunk_size (None or int): If None (default), apply a single vmap over inputs. 2025-12-04T13:48:12.9357982Z If not None, then compute the vmap :attr:`chunk_size` samples at a time. 2025-12-04T13:48:12.9358425Z Note that :attr:`chunk_size=1` is equivalent to computing the vmap with a for-loop. 2025-12-04T13:48:12.9358905Z If you run into memory issues computing the vmap, please try a non-None chunk_size. 2025-12-04T13:48:12.9359182Z 2025-12-04T13:48:12.9359263Z Returns: 2025-12-04T13:48:12.9359519Z Returns a new "batched" function. It takes the same inputs as 2025-12-04T13:48:12.9359954Z ``func``, except each input has an extra dimension at the index 2025-12-04T13:48:12.9360326Z specified by ``in_dims``. It takes returns the same outputs as 2025-12-04T13:48:12.9360699Z ``func``, except each output has an extra dimension at the index 2025-12-04T13:48:12.9361009Z specified by ``out_dims``. 2025-12-04T13:48:12.9361149Z 2025-12-04T13:48:12.9361212Z .. warning: 2025-12-04T13:48:12.9361413Z :func:`vmap` works best with functional-style code. Please do not 2025-12-04T13:48:12.9361688Z perform any side-effects in ``func``, with the exception of 2025-12-04T13:48:12.9362055Z in-place PyTorch operations. Examples of side-effects include mutating 2025-12-04T13:48:12.9362379Z Python data structures and assigning values to variables not captured 2025-12-04T13:48:12.9362604Z in ``func``. 2025-12-04T13:48:12.9362685Z 2025-12-04T13:48:12.9362813Z One example of using :func:`vmap` is to compute batched dot products. PyTorch 2025-12-04T13:48:12.9363109Z doesn't provide a batched ``torch.dot`` API; instead of unsuccessfully 2025-12-04T13:48:12.9363421Z rummaging through docs, use :func:`vmap` to construct a new function. 
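Annotation: the batched dot product mentioned above, sketched in plain Python to show the intended shapes. `batched` is a hypothetical helper that loops over the leading dimension explicitly. This reproduces the semantics described in the docstring, not the mechanism, since vmap is described as pushing the map into the underlying operations rather than looping.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Unbatched dot product: [D], [D] -> []."""
    return sum(p * q for p, q in zip(a, b))


def batched(func: Callable[[Vector, Vector], float]):
    """Hypothetical helper: map `func` over the leading dimension of both inputs."""
    def wrapper(xs: Sequence[Vector], ys: Sequence[Vector]) -> List[float]:
        assert len(xs) == len(ys), "batch sizes must match"
        return [func(x, y) for x, y in zip(xs, ys)]
    return wrapper


batched_dot = batched(dot)  # [N, D], [N, D] -> [N]
result = batched_dot([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
print(result)
```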
2025-12-04T13:48:12.9363588Z 2025-12-04T13:48:12.9363684Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:12.9363910Z >>> batched_dot = torch.func.vmap(torch.dot) # [N, D], [N, D] -> [N] 2025-12-04T13:48:12.9364149Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-12-04T13:48:12.9364333Z >>> batched_dot(x, y) 2025-12-04T13:48:12.9364429Z 2025-12-04T13:48:12.9364559Z :func:`vmap` can be helpful in hiding batch dimensions, leading to a simpler 2025-12-04T13:48:12.9364794Z model authoring experience. 2025-12-04T13:48:12.9364895Z 2025-12-04T13:48:12.9364965Z >>> batch_size, feature_size = 3, 5 2025-12-04T13:48:12.9365176Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-12-04T13:48:12.9365372Z >>> 2025-12-04T13:48:12.9365498Z >>> def model(feature_vec): 2025-12-04T13:48:12.9365675Z >>> # Very simple linear model with activation 2025-12-04T13:48:12.9365878Z >>> return feature_vec.dot(weights).relu() 2025-12-04T13:48:12.9366050Z >>> 2025-12-04T13:48:12.9366202Z >>> examples = torch.randn(batch_size, feature_size) 2025-12-04T13:48:12.9366408Z >>> result = torch.vmap(model)(examples) 2025-12-04T13:48:12.9366526Z 2025-12-04T13:48:12.9366660Z :func:`vmap` can also help vectorize computations that were previously difficult 2025-12-04T13:48:12.9366965Z or impossible to batch. One example is higher-order gradient computation. 2025-12-04T13:48:12.9367263Z The PyTorch autograd engine computes vjps (vector-Jacobian products). 2025-12-04T13:48:12.9367559Z Computing a full Jacobian matrix for some function f: R^N -> R^N usually 2025-12-04T13:48:12.9367863Z requires N calls to ``autograd.grad``, one per Jacobian row. Using :func:`vmap`, 2025-12-04T13:48:12.9368173Z we can vectorize the whole computation, computing the Jacobian in a single 2025-12-04T13:48:12.9368411Z call to ``autograd.grad``. 
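Annotation: the row-by-row Jacobian construction described above, worked through in plain Python for the elementwise square function. The `vjp` helper is hypothetical and uses a hand-written closed form in place of `autograd.grad`; it is meant only to show why one vector-Jacobian product per identity row yields the full matrix.

```python
from typing import List


def f(x: List[float]) -> List[float]:
    """Elementwise square, the same function used in the docstring example."""
    return [v ** 2 for v in x]


def vjp(x: List[float], v: List[float]) -> List[float]:
    # For elementwise x**2 the Jacobian is diag(2 * x), so v^T J = [v_i * 2 * x_i].
    # This closed form is written by hand here; autograd would compute it.
    return [vi * 2.0 * xi for vi, xi in zip(v, x)]


def identity_rows(n: int) -> List[List[float]]:
    """Rows of the n-by-n identity matrix, i.e. the basis vectors e_0..e_{n-1}."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


x = [1.0, 2.0, 3.0]
y = f(x)
# One vjp per basis vector e_i yields row i of the Jacobian (N calls in total).
jacobian = [vjp(x, e) for e in identity_rows(len(x))]
print(y, jacobian)
```

The list comprehension over `identity_rows` is the sequential approach; the docstring's point is that vmap can replace that loop with a single vectorized call.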
2025-12-04T13:48:12.9368511Z 2025-12-04T13:48:12.9368566Z >>> # Setup 2025-12-04T13:48:12.9368697Z >>> N = 5 2025-12-04T13:48:12.9368827Z >>> f = lambda x: x**2 2025-12-04T13:48:12.9368989Z >>> x = torch.randn(N, requires_grad=True) 2025-12-04T13:48:12.9369159Z >>> y = f(x) 2025-12-04T13:48:12.9369293Z >>> I_N = torch.eye(N) 2025-12-04T13:48:12.9369435Z >>> 2025-12-04T13:48:12.9369561Z >>> # Sequential approach 2025-12-04T13:48:12.9369783Z >>> jacobian_rows = [torch.autograd.grad(y, x, v, retain_graph=True)[0] 2025-12-04T13:48:12.9370023Z >>> for v in I_N.unbind()] 2025-12-04T13:48:12.9370212Z >>> jacobian = torch.stack(jacobian_rows) 2025-12-04T13:48:12.9370384Z >>> 2025-12-04T13:48:12.9370518Z >>> # vectorized gradient computation 2025-12-04T13:48:12.9370692Z >>> def get_vjp(v): 2025-12-04T13:48:12.9370857Z >>> return torch.autograd.grad(y, x, v) 2025-12-04T13:48:12.9371048Z >>> jacobian = torch.vmap(get_vjp)(I_N) 2025-12-04T13:48:12.9371190Z 2025-12-04T13:48:12.9371304Z :func:`vmap` can also be nested, producing an output with multiple batched dimensions 2025-12-04T13:48:12.9371455Z 2025-12-04T13:48:12.9371505Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:12.9371641Z >>> batched_dot = torch.vmap( 2025-12-04T13:48:12.9371774Z ... torch.vmap(torch.dot) 2025-12-04T13:48:12.9371954Z ... 
) # [N1, N0, D], [N1, N0, D] -> [N1, N0] 2025-12-04T13:48:12.9372114Z >>> x, y = torch.randn(2, 3, 5), torch.randn(2, 3, 5) 2025-12-04T13:48:12.9372324Z >>> batched_dot(x, y) # tensor of size [2, 3] 2025-12-04T13:48:12.9372422Z 2025-12-04T13:48:12.9372537Z If the inputs are not batched along the first dimension, ``in_dims`` specifies 2025-12-04T13:48:12.9372742Z the dimension that each inputs are batched along as 2025-12-04T13:48:12.9372850Z 2025-12-04T13:48:12.9372899Z >>> torch.dot # [N], [N] -> [] 2025-12-04T13:48:12.9373077Z >>> batched_dot = torch.vmap(torch.dot, in_dims=1) # [N, D], [N, D] -> [D] 2025-12-04T13:48:12.9373267Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-12-04T13:48:12.9373424Z >>> batched_dot( 2025-12-04T13:48:12.9373532Z ... x, y 2025-12-04T13:48:12.9373675Z ... ) # output is [5] instead of [2] if batched along the 0th dimension 2025-12-04T13:48:12.9373798Z 2025-12-04T13:48:12.9373904Z If there are multiple inputs each of which is batched along different dimensions, 2025-12-04T13:48:12.9374140Z ``in_dims`` must be a tuple with the batch dimension for each input as 2025-12-04T13:48:12.9374269Z 2025-12-04T13:48:12.9374317Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:12.9374500Z >>> batched_dot = torch.vmap(torch.dot, in_dims=(0, None)) # [N, D], [D] -> [N] 2025-12-04T13:48:12.9374694Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-12-04T13:48:12.9374829Z >>> batched_dot( 2025-12-04T13:48:12.9374934Z ... x, y 2025-12-04T13:48:12.9375075Z ... 
) # second arg doesn't have a batch dim because in_dim[1] was None 2025-12-04T13:48:12.9375197Z 2025-12-04T13:48:12.9375298Z If the input is a Python struct, ``in_dims`` must be a tuple containing a struct 2025-12-04T13:48:12.9375485Z matching the shape of the input: 2025-12-04T13:48:12.9375571Z 2025-12-04T13:48:12.9375640Z >>> f = lambda dict: torch.dot(dict["x"], dict["y"]) 2025-12-04T13:48:12.9375794Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-12-04T13:48:12.9375933Z >>> input = {"x": x, "y": y} 2025-12-04T13:48:12.9376091Z >>> batched_dot = torch.vmap(f, in_dims=({"x": 0, "y": None},)) 2025-12-04T13:48:12.9376254Z >>> batched_dot(input) 2025-12-04T13:48:12.9376328Z 2025-12-04T13:48:12.9376446Z By default, the output is batched along the first dimension. However, it can be batched 2025-12-04T13:48:12.9376654Z along any dimension by using ``out_dims`` 2025-12-04T13:48:12.9376747Z 2025-12-04T13:48:12.9376795Z >>> f = lambda x: x**2 2025-12-04T13:48:12.9376913Z >>> x = torch.randn(2, 5) 2025-12-04T13:48:12.9377051Z >>> batched_pow = torch.vmap(f, out_dims=1) 2025-12-04T13:48:12.9377194Z >>> batched_pow(x) # [5, 2] 2025-12-04T13:48:12.9377276Z 2025-12-04T13:48:12.9377395Z For any function that uses kwargs, the returned function will not batch the kwargs but will 2025-12-04T13:48:12.9377593Z accept kwargs 2025-12-04T13:48:12.9377672Z 2025-12-04T13:48:12.9377719Z >>> x = torch.randn([2, 5]) 2025-12-04T13:48:12.9377848Z >>> def fn(x, scale=4.): 2025-12-04T13:48:12.9377971Z >>> return x * scale 2025-12-04T13:48:12.9378084Z >>> 2025-12-04T13:48:12.9378189Z >>> batched_pow = torch.vmap(fn) 2025-12-04T13:48:12.9378343Z >>> assert torch.allclose(batched_pow(x), x * 4) 2025-12-04T13:48:12.9378543Z >>> batched_pow(x, scale=x) # scale is not batched, output has shape [2, 2, 5] 2025-12-04T13:48:12.9378682Z 2025-12-04T13:48:12.9378723Z .. 
note:: 2025-12-04T13:48:12.9378873Z vmap does not provide general autobatching or handle variable-length 2025-12-04T13:48:12.9379055Z sequences out of the box. 2025-12-04T13:48:12.9379162Z 2025-12-04T13:48:12.9379365Z Original Error: IndentationError('expected an indented block after function definition on line 4', ('', 5, 1, '_._ = None\n', 5, 2)) 2025-12-04T13:48:12.9379606Z 2025-12-04T13:48:12.9379646Z _._ = None 2025-12-04T13:48:12.9379748Z ^ 2025-12-04T13:48:12.9379998Z msg = Cannot scrape callname=grad in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=306. 2025-12-04T13:48:12.9380323Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:12.9380582Z ``grad`` operator helps computing gradients of ``func`` with respect to the 2025-12-04T13:48:12.9380822Z input(s) specified by ``argnums``. This operator can be nested to 2025-12-04T13:48:12.9381001Z compute higher-order gradients. 2025-12-04T13:48:12.9381094Z 2025-12-04T13:48:12.9381133Z Args: 2025-12-04T13:48:12.9381267Z func (Callable): A Python function that takes one or more arguments. 2025-12-04T13:48:12.9381501Z Must return a single-element Tensor. If specified ``has_aux`` equals ``True``, 2025-12-04T13:48:12.9381735Z function can return a tuple of single-element Tensor and other auxiliary objects: 2025-12-04T13:48:12.9381945Z ``(output, aux)``. 2025-12-04T13:48:12.9382118Z argnums (int or Tuple[int]): Specifies arguments to compute gradients with respect to. 2025-12-04T13:48:12.9382331Z ``argnums`` can be single integer or tuple of integers. Default: 0. 2025-12-04T13:48:12.9382526Z has_aux (bool): Flag indicating that ``func`` returns a tensor and other 2025-12-04T13:48:12.9382710Z auxiliary objects: ``(output, aux)``. Default: False. 2025-12-04T13:48:12.9382814Z 2025-12-04T13:48:12.9382852Z Returns: 2025-12-04T13:48:12.9383009Z Function to compute gradients with respect to its inputs. 
By default, the output of 2025-12-04T13:48:12.9383233Z the function is the gradient tensor(s) with respect to the first argument. 2025-12-04T13:48:12.9383465Z If specified ``has_aux`` equals ``True``, tuple of gradients and output auxiliary objects 2025-12-04T13:48:12.9383695Z is returned. If ``argnums`` is a tuple of integers, a tuple of output gradients with 2025-12-04T13:48:12.9383882Z respect to each ``argnums`` value is returned. 2025-12-04T13:48:12.9383977Z 2025-12-04T13:48:12.9384021Z Example of using ``grad``: 2025-12-04T13:48:12.9384096Z 2025-12-04T13:48:12.9384139Z >>> # xdoctest: +SKIP 2025-12-04T13:48:12.9384256Z >>> from torch.func import grad 2025-12-04T13:48:12.9384382Z >>> x = torch.randn([]) 2025-12-04T13:48:12.9384507Z >>> cos_x = grad(lambda x: torch.sin(x))(x) 2025-12-04T13:48:12.9384647Z >>> assert torch.allclose(cos_x, x.cos()) 2025-12-04T13:48:12.9384769Z >>> 2025-12-04T13:48:12.9384866Z >>> # Second-order gradients 2025-12-04T13:48:12.9385006Z >>> neg_sin_x = grad(grad(lambda x: torch.sin(x)))(x) 2025-12-04T13:48:12.9385160Z >>> assert torch.allclose(neg_sin_x, -x.sin()) 2025-12-04T13:48:12.9385253Z 2025-12-04T13:48:12.9385350Z When composed with ``vmap``, ``grad`` can be used to compute per-sample-gradients: 2025-12-04T13:48:12.9385482Z 2025-12-04T13:48:12.9385524Z >>> # xdoctest: +SKIP 2025-12-04T13:48:12.9385646Z >>> from torch.func import grad, vmap 2025-12-04T13:48:12.9385779Z >>> batch_size, feature_size = 3, 5 2025-12-04T13:48:12.9385896Z >>> 2025-12-04T13:48:12.9385995Z >>> def model(weights, feature_vec): 2025-12-04T13:48:12.9386133Z >>> # Very simple linear model with activation 2025-12-04T13:48:12.9386270Z >>> assert feature_vec.dim() == 1 2025-12-04T13:48:12.9386405Z >>> return feature_vec.dot(weights).relu() 2025-12-04T13:48:12.9386527Z >>> 2025-12-04T13:48:12.9386634Z >>> def compute_loss(weights, example, target): 2025-12-04T13:48:12.9386772Z >>> y = model(weights, example) 2025-12-04T13:48:12.9386910Z >>> return ((y - target) 
** 2).mean() # MSELoss 2025-12-04T13:48:12.9387063Z >>> 2025-12-04T13:48:12.9387182Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-12-04T13:48:12.9387349Z >>> examples = torch.randn(batch_size, feature_size) 2025-12-04T13:48:12.9387493Z >>> targets = torch.randn(batch_size) 2025-12-04T13:48:12.9387627Z >>> inputs = (weights, examples, targets) 2025-12-04T13:48:12.9387802Z >>> grad_weight_per_example = vmap(grad(compute_loss), in_dims=(None, 0, 0))( 2025-12-04T13:48:12.9387993Z ... *inputs 2025-12-04T13:48:12.9388092Z ... ) 2025-12-04T13:48:12.9388144Z 2025-12-04T13:48:12.9388233Z Example of using ``grad`` with ``has_aux`` and ``argnums``: 2025-12-04T13:48:12.9388339Z 2025-12-04T13:48:12.9388384Z >>> # xdoctest: +SKIP 2025-12-04T13:48:12.9388502Z >>> from torch.func import grad 2025-12-04T13:48:12.9388628Z >>> def my_loss_func(y, y_pred): 2025-12-04T13:48:12.9388767Z >>> loss_per_sample = (0.5 * y_pred - y) ** 2 2025-12-04T13:48:12.9388922Z >>> loss = loss_per_sample.mean() 2025-12-04T13:48:12.9389062Z >>> return loss, (y_pred, loss_per_sample) 2025-12-04T13:48:12.9389187Z >>> 2025-12-04T13:48:12.9389298Z >>> fn = grad(my_loss_func, argnums=(0, 1), has_aux=True) 2025-12-04T13:48:12.9389441Z >>> y_true = torch.rand(4) 2025-12-04T13:48:12.9389572Z >>> y_preds = torch.rand(4, requires_grad=True) 2025-12-04T13:48:12.9389709Z >>> out = fn(y_true, y_preds) 2025-12-04T13:48:12.9389882Z >>> # > output is ((grads w.r.t y_true, grads w.r.t y_preds), (y_pred, loss_per_sample)) 2025-12-04T13:48:12.9390007Z 2025-12-04T13:48:12.9390049Z .. note:: 2025-12-04T13:48:12.9390171Z Using PyTorch ``torch.no_grad`` together with ``grad``. 
2025-12-04T13:48:12.9390273Z 2025-12-04T13:48:12.9390336Z Case 1: Using ``torch.no_grad`` inside a function: 2025-12-04T13:48:12.9390431Z 2025-12-04T13:48:12.9390478Z >>> # xdoctest: +SKIP 2025-12-04T13:48:12.9390596Z >>> def f(x): 2025-12-04T13:48:12.9390707Z >>> with torch.no_grad(): 2025-12-04T13:48:12.9390831Z >>> c = x ** 2 2025-12-04T13:48:12.9390947Z >>> return x - c 2025-12-04T13:48:12.9391019Z 2025-12-04T13:48:12.9391096Z In this case, ``grad(f)(x)`` will respect the inner ``torch.no_grad``. 2025-12-04T13:48:12.9391207Z 2025-12-04T13:48:12.9391281Z Case 2: Using ``grad`` inside ``torch.no_grad`` context manager: 2025-12-04T13:48:12.9391387Z 2025-12-04T13:48:12.9391430Z >>> # xdoctest: +SKIP 2025-12-04T13:48:12.9391547Z >>> with torch.no_grad(): 2025-12-04T13:48:12.9391662Z >>> grad(f)(x) 2025-12-04T13:48:12.9391730Z 2025-12-04T13:48:12.9391814Z In this case, ``grad`` will respect the inner ``torch.no_grad``, but not the 2025-12-04T13:48:12.9392054Z outer one. This is because ``grad`` is a "function transform": its result 2025-12-04T13:48:12.9392259Z should not depend on the result of a context manager outside of ``f``. 2025-12-04T13:48:12.9392375Z 2025-12-04T13:48:12.9392410Z 2025-12-04T13:48:12.9392634Z Original Error: IndentationError('expected an indented block after function definition on line 5', ('', 6, 1, '_._ = None\n', 6, 2)) 2025-12-04T13:48:12.9392843Z 2025-12-04T13:48:12.9392879Z _._ = None 2025-12-04T13:48:12.9392962Z ^ 2025-12-04T13:48:15.8679081Z msg = Cannot scrape callname=CustomOpDef.register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py line=402. 2025-12-04T13:48:15.8680036Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:15.8680568Z Register a FakeTensor implementation for this custom op. 2025-12-04T13:48:15.8680817Z 2025-12-04T13:48:15.8681044Z This is necessary to get the operator to work efficiently with torch.compile. 
2025-12-04T13:48:15.8681755Z 2025-12-04T13:48:15.8682107Z The Fake impl (sometimes also known as a meta kernel or abstract impl) 2025-12-04T13:48:15.8682585Z specifies the behavior of this operator on Tensors that carry no data. 2025-12-04T13:48:15.8683006Z Given some input Tensors with certain properties 2025-12-04T13:48:15.8683441Z (sizes/strides/storage_offset/device), it specifies what the properties of 2025-12-04T13:48:15.8683845Z the output Tensors are. 2025-12-04T13:48:15.8684021Z 2025-12-04T13:48:15.8684206Z Please see :func:`torch.library.register_fake` for more details. 2025-12-04T13:48:15.8684547Z 2025-12-04T13:48:15.8684636Z Args: 2025-12-04T13:48:15.8685013Z fn (Callable): The function to register as the FakeTensor 2025-12-04T13:48:15.8685354Z implementation. 2025-12-04T13:48:15.8685517Z 2025-12-04T13:48:15.8685608Z Examples: 2025-12-04T13:48:15.8685829Z >>> import torch 2025-12-04T13:48:15.8686087Z >>> import numpy as np 2025-12-04T13:48:15.8686373Z >>> from torch import Tensor 2025-12-04T13:48:15.8686688Z >>> 2025-12-04T13:48:15.8686986Z >>> # Example 1: an operator without data-dependent output shape 2025-12-04T13:48:15.8687443Z >>> @torch.library.custom_op("mylib::linear", mutates_args=()) 2025-12-04T13:48:15.8687875Z >>> def linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-12-04T13:48:15.8688244Z >>> return (x @ weight.t()) + bias 2025-12-04T13:48:15.8688529Z >>> 2025-12-04T13:48:15.8688756Z >>> @linear.register_fake 2025-12-04T13:48:15.8689042Z >>> def _(x, weight, bias): 2025-12-04T13:48:15.8689318Z >>> assert x.dim() == 2 2025-12-04T13:48:15.8689612Z >>> assert weight.dim() == 2 2025-12-04T13:48:15.8689908Z >>> assert bias.dim() == 1 2025-12-04T13:48:15.8690218Z >>> assert x.shape[1] == weight.shape[1] 2025-12-04T13:48:15.8690550Z >>> assert weight.shape[0] == bias.shape[0] 2025-12-04T13:48:15.8690889Z >>> assert x.device == weight.device 2025-12-04T13:48:15.8691192Z >>> return x.new_empty(x.size(0), weight.size(0)) 
2025-12-04T13:48:15.8691418Z >>> 2025-12-04T13:48:15.8691575Z >>> x = torch.randn(2, 2) 2025-12-04T13:48:15.8691774Z >>> weight = torch.randn(2, 2) 2025-12-04T13:48:15.8692015Z >>> bias = torch.randn(2) 2025-12-04T13:48:15.8692227Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:15.8692493Z >>> out = torch.compile(linear, fullgraph=True)(x, weight, bias) 2025-12-04T13:48:15.8692758Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:15.8693054Z >>> assert torch.allclose(out, torch.nn.functional.linear(x, weight, bias)) 2025-12-04T13:48:15.8693320Z >>> 2025-12-04T13:48:15.8693518Z >>> # Example 2: an operator with data-dependent output shape 2025-12-04T13:48:15.8693812Z >>> @torch.library.custom_op("mylib::nonzero", mutates_args=()) 2025-12-04T13:48:15.8694070Z >>> def nonzero(x: Tensor) -> Tensor: 2025-12-04T13:48:15.8694279Z >>> x_np = x.cpu().numpy() 2025-12-04T13:48:15.8694493Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-12-04T13:48:15.8694726Z >>> return torch.tensor(res, device=x.device) 2025-12-04T13:48:15.8694925Z >>> 2025-12-04T13:48:15.8695084Z >>> @nonzero.register_fake 2025-12-04T13:48:15.8695275Z >>> def _(x): 2025-12-04T13:48:15.8695478Z >>> # Number of nonzero-elements is data-dependent. 2025-12-04T13:48:15.8695739Z >>> # Since we cannot peek at the data in an abstract impl, 2025-12-04T13:48:15.8696001Z >>> # we use the ctx object to construct a new symint that 2025-12-04T13:48:15.8696245Z >>> # represents the data-dependent size. 
2025-12-04T13:48:15.8696508Z >>> ctx = torch.library.get_ctx() 2025-12-04T13:48:15.8696723Z >>> nnz = ctx.new_dynamic_size() 2025-12-04T13:48:15.8696929Z >>> shape = [nnz, x.dim()] 2025-12-04T13:48:15.8697157Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-12-04T13:48:15.8697379Z >>> return result 2025-12-04T13:48:15.8697551Z >>> 2025-12-04T13:48:15.8697716Z >>> x = torch.tensor([0, 1, 2, 0, 0, 1]) 2025-12-04T13:48:15.8697939Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:15.8698206Z >>> out = torch.compile(nonzero, fullgraph=True)(x) 2025-12-04T13:48:15.8698468Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:15.8698700Z >>> assert torch.allclose(out, x.nonzero()) 2025-12-04T13:48:15.8698841Z 2025-12-04T13:48:15.8698900Z 2025-12-04T13:48:15.8699273Z Original Error: IndentationError('expected an indented block after function definition on line 36', ('', 37, 1, '_._ = None\n', 37, 2)) 2025-12-04T13:48:15.8699647Z 2025-12-04T13:48:15.8699733Z _._ = None 2025-12-04T13:48:15.8699870Z ^ 2025-12-04T13:48:15.8765735Z msg = Cannot scrape callname=unsafe_generate_fake_kernels in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_profile.py line=94. 2025-12-04T13:48:15.8766171Z Caused by: DoctestParseError('Failed to parse doctest in _label_docsrc_lines') 2025-12-04T13:48:15.8766335Z 2025-12-04T13:48:15.8766443Z Registers a fake kernel based on the given operator profiles. This fake 2025-12-04T13:48:15.8766702Z kernel registration will override any existing fake kernel registrations. 2025-12-04T13:48:15.8766849Z 2025-12-04T13:48:15.8766946Z The input is a dictionary mapping operator names to a set of operator 2025-12-04T13:48:15.8767179Z profiles, which we will use to generate fake kernels. The operator profiles 2025-12-04T13:48:15.8767411Z are a record of the input and output tensor metadata. 
Based on this 2025-12-04T13:48:15.8767645Z information we will match a given input to the recorded profile, and return 2025-12-04T13:48:15.8767888Z an output with the same metadata as in the recorded profile. If a profile 2025-12-04T13:48:15.8768092Z doesn't exist then an exception will be thrown. 2025-12-04T13:48:15.8768197Z 2025-12-04T13:48:15.8768299Z The fake kernel generation is considered unsafe because it relies on the 2025-12-04T13:48:15.8768536Z rigid, pre-defined operator profiles that do not account for potential 2025-12-04T13:48:15.8768782Z variations in output behavior. Specifically, the generated kernels assume a 2025-12-04T13:48:15.8769035Z fixed relationship between input and output ranks. However, in reality, it's 2025-12-04T13:48:15.8769289Z possible that data-dependent operations may produce outputs of different 2025-12-04T13:48:15.8769529Z ranks even when given inputs of the same rank. The generated fake kernels 2025-12-04T13:48:15.8769760Z are inflexible and unable to accommodate these nuances, making them 2025-12-04T13:48:15.8769943Z potentially unsafe. 2025-12-04T13:48:15.8770016Z 2025-12-04T13:48:15.8770060Z Args: 2025-12-04T13:48:15.8770214Z op_profiles (dict[str, set[OpProfile]]): A dictionary mapping operator 2025-12-04T13:48:15.8770439Z name to a set of operator profiles from which we will generate fake 2025-12-04T13:48:15.8770612Z kernels. 
2025-12-04T13:48:15.8770678Z 2025-12-04T13:48:15.8770725Z Examples: 2025-12-04T13:48:15.8770784Z 2025-12-04T13:48:15.8770868Z >>> # Example: Registering an op-profile from draft-export 2025-12-04T13:48:15.8771031Z >>> import torch 2025-12-04T13:48:15.8771174Z >>> from torch.export._draft_export import draft_export 2025-12-04T13:48:15.8771325Z >>> 2025-12-04T13:48:15.8771462Z >>> @torch.library.custom_op("mylib::foo", mutates_args=()) 2025-12-04T13:48:15.8771643Z >>> def foo(x: Tensor, y: Tensor) -> Tensor: 2025-12-04T13:48:15.8771787Z >>> return x + y 2025-12-04T13:48:15.8771939Z >>> 2025-12-04T13:48:15.8772087Z >>> class M(torch.nn.Module): 2025-12-04T13:48:15.8772228Z >>> def forward(self, a, b): 2025-12-04T13:48:15.8772386Z >>> res = torch.ops.mylib.foo(a, b) # no fake impl 2025-12-04T13:48:15.8772540Z >>> return res 2025-12-04T13:48:15.8772656Z >>> 2025-12-04T13:48:15.8772787Z >>> ep = draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)) 2025-12-04T13:48:15.8772944Z >>> 2025-12-04T13:48:15.8773118Z >>> with torch._library.fake_profile.unsafe_generate_fake_kernels(ep._report.op_profiles): 2025-12-04T13:48:15.8773364Z >>> decomp = ep.run_decompositions() 2025-12-04T13:48:15.8773461Z 2025-12-04T13:48:15.8773463Z 2025-12-04T13:48:15.8773668Z Original Error: IncompleteParseError('ill-formed doctest: all parts have been processed but the doctest source is not balanced') 2025-12-04T13:48:15.8773898Z 2025-12-04T13:48:16.2619463Z msg = Cannot scrape callname=ActivationSparsifier in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py line=16. 2025-12-04T13:48:16.2620065Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:16.2620251Z 2025-12-04T13:48:16.2620391Z The Activation sparsifier class aims to sparsify/prune activations in a neural 2025-12-04T13:48:16.2620696Z network. 
The idea is to attach the sparsifier to a layer (or layers) and it 2025-12-04T13:48:16.2620993Z zeroes out the activations based on the mask_fn (or sparsification function) 2025-12-04T13:48:16.2621229Z input by the user. 2025-12-04T13:48:16.2621433Z The mask_fn is applied once all the inputs are aggregated and reduced i.e. 2025-12-04T13:48:16.2621726Z mask = mask_fn(reduce_fn(aggregate_fn(activations))) 2025-12-04T13:48:16.2621898Z 2025-12-04T13:48:16.2621983Z Note:: 2025-12-04T13:48:16.2622216Z The sparsification mask is computed on the input **before it goes through the attached layer**. 2025-12-04T13:48:16.2622431Z 2025-12-04T13:48:16.2622481Z Args: 2025-12-04T13:48:16.2622608Z model (nn.Module): 2025-12-04T13:48:16.2622814Z The model whose layers will be sparsified. The layers that needs to be 2025-12-04T13:48:16.2623104Z sparsified should be added separately using the register_layer() function 2025-12-04T13:48:16.2623343Z aggregate_fn (Optional, Callable): 2025-12-04T13:48:16.2623587Z default aggregate_fn that is used if not specified while registering the layer. 2025-12-04T13:48:16.2623861Z specifies how inputs should be aggregated over time. 2025-12-04T13:48:16.2624145Z The aggregate_fn should usually take 2 torch tensors and return the aggregated tensor. 2025-12-04T13:48:16.2624391Z Example 2025-12-04T13:48:16.2624569Z def add_agg_fn(tensor1, tensor2): return tensor1 + tensor2 2025-12-04T13:48:16.2624781Z reduce_fn (Optional, Callable): 2025-12-04T13:48:16.2625021Z default reduce_fn that is used if not specified while registering the layer. 2025-12-04T13:48:16.2625333Z reduce_fn will be called on the aggregated tensor i.e. the tensor obtained after 2025-12-04T13:48:16.2625583Z calling agg_fn() on all inputs. 
2025-12-04T13:48:16.2625754Z Example 2025-12-04T13:48:16.2625944Z def mean_reduce_fn(agg_tensor): return agg_tensor.mean(dim=0) 2025-12-04T13:48:16.2626163Z mask_fn (Optional, Callable): 2025-12-04T13:48:16.2626426Z default mask_fn that is used to create the sparsification mask using the tensor obtained after 2025-12-04T13:48:16.2626760Z calling the reduce_fn(). This is used by default if a custom one is passed in the 2025-12-04T13:48:16.2627004Z register_layer(). 2025-12-04T13:48:16.2627281Z Note that the mask_fn() definition should contain the sparse arguments that is passed in sparse_config 2025-12-04T13:48:16.2627560Z arguments. 2025-12-04T13:48:16.2627716Z features (Optional, list): 2025-12-04T13:48:16.2627986Z default selected features to sparsify. 2025-12-04T13:48:16.2628245Z If this is non-empty, then the mask_fn will be applied for each feature of the input. 2025-12-04T13:48:16.2628494Z For example, 2025-12-04T13:48:16.2628721Z mask = [mask_fn(reduce_fn(aggregated_fn(input[feature])) for feature in features] 2025-12-04T13:48:16.2628970Z feature_dim (Optional, int): 2025-12-04T13:48:16.2629224Z default dimension of input features. Again, features along this dim will be chosen 2025-12-04T13:48:16.2629508Z for sparsification. 2025-12-04T13:48:16.2629716Z sparse_config (Dict): 2025-12-04T13:48:16.2629934Z Default configuration for the mask_fn. This config will be passed 2025-12-04T13:48:16.2630156Z with the mask_fn() 2025-12-04T13:48:16.2630262Z 2025-12-04T13:48:16.2630319Z Example: 2025-12-04T13:48:16.2630445Z >>> # xdoctest: +SKIP 2025-12-04T13:48:16.2630598Z >>> model = SomeModel() 2025-12-04T13:48:16.2630844Z >>> act_sparsifier = ActivationSparsifier(...) 
# init activation sparsifier 2025-12-04T13:48:16.2631038Z >>> # Initialize aggregate_fn 2025-12-04T13:48:16.2631166Z >>> def agg_fn(x, y): 2025-12-04T13:48:16.2631286Z >>> return x + y 2025-12-04T13:48:16.2631400Z >>> 2025-12-04T13:48:16.2631504Z >>> # Initialize reduce_fn 2025-12-04T13:48:16.2631631Z >>> def reduce_fn(x): 2025-12-04T13:48:16.2631757Z >>> return torch.mean(x, dim=0) 2025-12-04T13:48:16.2631928Z >>> 2025-12-04T13:48:16.2632029Z >>> # Initialize mask_fn 2025-12-04T13:48:16.2632157Z >>> def mask_fn(data): 2025-12-04T13:48:16.2632302Z >>> return torch.eye(data.shape).to(data.device) 2025-12-04T13:48:16.2632450Z >>> 2025-12-04T13:48:16.2632541Z >>> 2025-12-04T13:48:16.2632646Z >>> act_sparsifier.register_layer( 2025-12-04T13:48:16.2632787Z ... model.some_layer, 2025-12-04T13:48:16.2632916Z ... aggregate_fn=agg_fn, 2025-12-04T13:48:16.2633049Z ... reduce_fn=reduce_fn, 2025-12-04T13:48:16.2633177Z ... mask_fn=mask_fn, 2025-12-04T13:48:16.2633295Z ... ) 2025-12-04T13:48:16.2633389Z >>> 2025-12-04T13:48:16.2633489Z >>> # start training process 2025-12-04T13:48:16.2633616Z >>> for _ in [...]: 2025-12-04T13:48:16.2633733Z >>> # epoch starts 2025-12-04T13:48:16.2633883Z >>> # model.forward(), compute_loss() and model.backwards() 2025-12-04T13:48:16.2634046Z >>> # epoch ends 2025-12-04T13:48:16.2634166Z >>> act_sparsifier.step() 2025-12-04T13:48:16.2634300Z >>> # end training process 2025-12-04T13:48:16.2634434Z >>> sparsifier.squash_mask() 2025-12-04T13:48:16.2634516Z 2025-12-04T13:48:16.2634716Z Original Error: IndentationError("expected an indented block after 'for' statement on line 25", ('', 26, 1, '_._ = None\n', 26, 2)) 2025-12-04T13:48:16.2634949Z 2025-12-04T13:48:16.2634993Z _._ = None 2025-12-04T13:48:16.2635092Z ^ 2025-12-04T13:48:16.8794601Z msg = Cannot scrape callname=DeviceMesh.__getitem__ in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py line=547. 
2025-12-04T13:48:16.8795423Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:16.8795735Z 2025-12-04T13:48:16.8795968Z Slice the current DeviceMesh based on the mesh_dim_names given to create a submesh. 2025-12-04T13:48:16.8796512Z The submesh created consists of the dimensions and the communicators indicated by 2025-12-04T13:48:16.8796907Z ``mesh_dim_names`` 2025-12-04T13:48:16.8797039Z 2025-12-04T13:48:16.8797130Z Args: 2025-12-04T13:48:16.8797445Z mesh_dim_names (Union[str, Tuple[str]]): the name or the tuple of names of the 2025-12-04T13:48:16.8797903Z mesh dimension of the DeviceMesh to create the submesh for. 2025-12-04T13:48:16.8798236Z Returns: 2025-12-04T13:48:16.8798457Z A :class:`DeviceMesh` object 2025-12-04T13:48:16.8798807Z 2025-12-04T13:48:16.8799041Z The following program runs on each process/rank in an SPMD manner in a world size of 8. 2025-12-04T13:48:16.8799457Z In the first example: 2025-12-04T13:48:16.8799824Z Calling mesh_2d["tp"] on rank 0, 1, 2, 3 returns a 1D submesh of DeviceMesh:([0, 1, 2, 3]). 2025-12-04T13:48:16.8800332Z Calling mesh_2d["tp"] on rank 4, 5, 6, 7 returns a 1D submesh of DeviceMesh:([4, 5, 6, 7]). 2025-12-04T13:48:16.8800776Z Calling mesh_2d["dp"] on rank 0, 4 returns a 1D submesh of DeviceMesh:([0, 4]). 2025-12-04T13:48:16.8801238Z Calling mesh_2d["dp"] on rank 1, 5 returns a 1D submesh of DeviceMesh:([1, 5]). 2025-12-04T13:48:16.8801700Z Calling mesh_2d["dp"] on rank 2, 6 returns a 1D submesh of DeviceMesh:([2, 6]). 2025-12-04T13:48:16.8802250Z Calling mesh_2d["dp"] on rank 3, 7 returns a 1D submesh of DeviceMesh:([3, 7]). 2025-12-04T13:48:16.8802491Z 2025-12-04T13:48:16.8802571Z In the second example: 2025-12-04T13:48:16.8802904Z Calling mesh_3d["dp", "cp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 1], [4, 5]]). 2025-12-04T13:48:16.8803423Z Calling mesh_3d["dp", "cp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 3], [6, 7]]). 
2025-12-04T13:48:16.8803876Z Calling mesh_3d["cp", "dp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 4], [1, 5]]). 2025-12-04T13:48:16.8804318Z Calling mesh_3d["cp", "dp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 6], [3, 7]]). 2025-12-04T13:48:16.8804579Z 2025-12-04T13:48:16.8804680Z Example:: 2025-12-04T13:48:16.8804784Z 2025-12-04T13:48:16.8804883Z >>> # xdoctest: +SKIP("no rank") 2025-12-04T13:48:16.8805191Z >>> from torch.distributed.device_mesh import DeviceMesh 2025-12-04T13:48:16.8805475Z >>> 2025-12-04T13:48:16.8805725Z >>> # Initialize a 2D device mesh as (2, 4) to represent the topology 2025-12-04T13:48:16.8806063Z >>> # of cross-host(dim 0), and within-host (dim 1). 2025-12-04T13:48:16.8806447Z >>> mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-12-04T13:48:16.8806807Z >>> tp_mesh = mesh_2d["tp"] 2025-12-04T13:48:16.8807033Z >>> dp_mesh = mesh_2d["dp"] 2025-12-04T13:48:16.8807246Z >>> 2025-12-04T13:48:16.8807424Z >>> # Initialize a 3D mesh. 2025-12-04T13:48:16.8807775Z >>> mesh_3d = init_device_mesh(device_type="cuda", (2,2,2), mesh_dim_names=("dp", "pp", "cp")) 2025-12-04T13:48:16.8808268Z >>> # The order of the mesh_dim_names provided deteremines the order of dimensions in the submesh. 
2025-12-04T13:48:16.8808663Z >>> dp_cp_mesh = mesh_3d["dp", "cp"] 2025-12-04T13:48:16.8808920Z >>> cp_dp_mesh = mesh_3d["cp", "dp"] 2025-12-04T13:48:16.8809086Z 2025-12-04T13:48:16.8809569Z Original Error: SyntaxError('positional argument follows keyword argument', ('', 6, 82, 'mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp"))\n', 6, 83)) 2025-12-04T13:48:16.8810108Z 2025-12-04T13:48:16.8810299Z mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-12-04T13:48:16.8810672Z ^ 2025-12-04T13:48:17.0301770Z msg = Cannot scrape callname=SavePlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=122. 2025-12-04T13:48:17.0302873Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:17.0303252Z 2025-12-04T13:48:17.0303543Z Abstract class defining the protocol used by save_state_dict to plan the save process. 2025-12-04T13:48:17.0303960Z 2025-12-04T13:48:17.0304264Z SavePlanners are stateful objects that can be used to customize the whole save process. 2025-12-04T13:48:17.0304657Z 2025-12-04T13:48:17.0304948Z SavePlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-12-04T13:48:17.0305456Z will be visible to the whole process. 2025-12-04T13:48:17.0305687Z 2025-12-04T13:48:17.0305973Z A planner subclass can expect the following sequence of calls during save_state_dict: 2025-12-04T13:48:17.0306609Z 2025-12-04T13:48:17.0306766Z 1) set_up_planner - called on all ranks. 2025-12-04T13:48:17.0307139Z Signals the start of a checkpoint save. 2025-12-04T13:48:17.0307368Z 2025-12-04T13:48:17.0307510Z 2) create_local_plan - called on all ranks. 2025-12-04T13:48:17.0308026Z Process the state_dict and produces a `SavePlan` that will be sent for global planning. 2025-12-04T13:48:17.0308420Z 2025-12-04T13:48:17.0308622Z 3) create_global_plan - called on the coordinator rank only. 
2025-12-04T13:48:17.0309190Z Takes the SavePlan from all ranks and make any global decision. 2025-12-04T13:48:17.0309493Z 2025-12-04T13:48:17.0309690Z 4) finish_plan - called on all ranks. 2025-12-04T13:48:17.0310129Z This gives each rank a chance to adjust to global planning decisions. 2025-12-04T13:48:17.0310465Z 2025-12-04T13:48:17.0310635Z 5) resolve_data - called multiple times on each rank 2025-12-04T13:48:17.0311117Z Lookups a value on the `state_dict` for the storage layer to write. 2025-12-04T13:48:17.0311492Z 2025-12-04T13:48:17.0311793Z Users are recommended to extend DefaultSavePlanner instead of this interface directly as 2025-12-04T13:48:17.0312496Z most changes can be expressed by changes in a single method. 2025-12-04T13:48:17.0312669Z 2025-12-04T13:48:17.0312749Z There are 3 usual patterns of extension: 2025-12-04T13:48:17.0312878Z 2025-12-04T13:48:17.0313029Z Rewriting state_dict. This is the simplest way to extend the save process as it 2025-12-04T13:48:17.0313361Z doesn't requite understanding the intrincacies of how SavePlan works: 2025-12-04T13:48:17.0313544Z 2025-12-04T13:48:17.0313625Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0313836Z >>> class RenamePlanner(DefaultSavePlanner): 2025-12-04T13:48:17.0314040Z >>> def set_up_planner( 2025-12-04T13:48:17.0314207Z >>> self, 2025-12-04T13:48:17.0314371Z >>> state_dict: STATE_DICT_TYPE, 2025-12-04T13:48:17.0314583Z >>> storage_meta: Optional[StorageMeta], 2025-12-04T13:48:17.0314791Z >>> is_coordinator: bool, 2025-12-04T13:48:17.0314975Z >>> ) -> None: 2025-12-04T13:48:17.0315137Z >>> # prefix all keys with `foo_`` 2025-12-04T13:48:17.0315433Z >>> super().set_up_planner({"foo_" + k: v for k, v in state_dict.items()}, storage_meta, is_coordinator) 2025-12-04T13:48:17.0315647Z 2025-12-04T13:48:17.0315842Z Modifying local plan and lookup in tandem. 
This is useful when fine control of how data is persisted 2025-12-04T13:48:17.0316085Z 2025-12-04T13:48:17.0316161Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0316374Z >>> class FP16Planner(DefaultSavePlanner): 2025-12-04T13:48:17.0316581Z >>> def create_local_plan(self): 2025-12-04T13:48:17.0316783Z >>> plan = super().create_local_plan() 2025-12-04T13:48:17.0316981Z >>> for p in plan: 2025-12-04T13:48:17.0317176Z >>> if p.tensor_data is not None: 2025-12-04T13:48:17.0317408Z >>> p.tensor_data.properties.dtype = torch.float16 2025-12-04T13:48:17.0317637Z >>> return plan 2025-12-04T13:48:17.0317790Z >>> 2025-12-04T13:48:17.0317943Z >>> def resolve_data(self, write_item): 2025-12-04T13:48:17.0318151Z >>> item = super().resolve_data(write_item) 2025-12-04T13:48:17.0318455Z >>> return item if write_item.type == WriteItemType.BYTE_IO else item.to(torch.float16) 2025-12-04T13:48:17.0318667Z 2025-12-04T13:48:17.0318863Z Using the global planning step to make central decisions that can't be made individually by each rank 2025-12-04T13:48:17.0319136Z 2025-12-04T13:48:17.0319218Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0319416Z >>> from itertools import zip_longest 2025-12-04T13:48:17.0319607Z >>> from dataclasses import replace 2025-12-04T13:48:17.0319866Z >>> class DDPLoadBalancingPlanner(DefaultSavePlanner): 2025-12-04T13:48:17.0320112Z >>> # This uses the default local plan behavior of having all non-sharded writes in rank 0 2025-12-04T13:48:17.0320364Z >>> # This sample doesn't handle ShardedTensors 2025-12-04T13:48:17.0320528Z >>> def create_global_plan(self, all_plans): 2025-12-04T13:48:17.0320710Z >>> iters = [iter(all_plans[0].items)] * len(all_plans) 2025-12-04T13:48:17.0320873Z >>> items_per_rank = [ 2025-12-04T13:48:17.0321028Z >>> [item for item in items if item is not None] 2025-12-04T13:48:17.0321211Z >>> for items in zip(*zip_longest(*iters), strict=True) 2025-12-04T13:48:17.0321373Z >>> ] 2025-12-04T13:48:17.0321504Z >>> 
all_plans = [ 2025-12-04T13:48:17.0321640Z >>> replace(plan, items=items) 2025-12-04T13:48:17.0321841Z >>> for plan, items in zip(all_plans, items_per_rank, strict=True) 2025-12-04T13:48:17.0322054Z >>> ] 2025-12-04T13:48:17.0322191Z >>> return super().create_global_plan(all_plans) 2025-12-04T13:48:17.0322301Z 2025-12-04T13:48:17.0322413Z Finally, some planners need to save additional metadata in the checkpoint, this is 2025-12-04T13:48:17.0322760Z accomplished by having each rank contribute their data items in the local plan and 2025-12-04T13:48:17.0322978Z the global planner aggregate them: 2025-12-04T13:48:17.0323077Z 2025-12-04T13:48:17.0323131Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0323299Z >>> class SaveExtraDataPlanner(DefaultSavePlanner): 2025-12-04T13:48:17.0323480Z >>> def create_local_plan(self) -> SavePlan: 2025-12-04T13:48:17.0323643Z >>> plan = super().create_local_plan() 2025-12-04T13:48:17.0323829Z >>> return replace(plan, planner_data="per-rank-data") 2025-12-04T13:48:17.0323989Z >>> 2025-12-04T13:48:17.0324177Z >>> def create_global_plan(self, all_plans: List[SavePlan]) -> Tuple[List[SavePlan], Metadata]: 2025-12-04T13:48:17.0324446Z >>> global_plan, metadata = super().create_global_plan(all_plans) 2025-12-04T13:48:17.0324656Z >>> merged_data = [p.planner_data for p in global_plan] 2025-12-04T13:48:17.0324857Z >>> metadata = replace(metadata, planner_data=merged_data) 2025-12-04T13:48:17.0325040Z >>> return global_plan, metadata 2025-12-04T13:48:17.0325140Z 2025-12-04T13:48:17.0325361Z Original Error: IndentationError('expected an indented block after function definition on line 3', ('', 9, 0, '_._ = None\n', 9, -1)) 2025-12-04T13:48:17.0325621Z 2025-12-04T13:48:17.0325660Z _._ = None 2025-12-04T13:48:17.0325758Z ^ 2025-12-04T13:48:17.0326061Z msg = Cannot scrape callname=LoadPlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=305. 
2025-12-04T13:48:17.0326454Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:17.0326610Z 2025-12-04T13:48:17.0326730Z Abstract class defining the protocol used by load_state_dict to plan the load process. 2025-12-04T13:48:17.0326906Z 2025-12-04T13:48:17.0327025Z LoadPlanner are stateful objects that can be used to customize the whole load process. 2025-12-04T13:48:17.0327203Z 2025-12-04T13:48:17.0327328Z LoadPlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-12-04T13:48:17.0327546Z will be visible to the whole process. 2025-12-04T13:48:17.0327644Z 2025-12-04T13:48:17.0327769Z A planner subclass can expect the following sequence of calls during load_state_dict: 2025-12-04T13:48:17.0327932Z 2025-12-04T13:48:17.0327999Z 1) set_up_planner - called on all ranks. 2025-12-04T13:48:17.0328162Z Signals the start of loading a checkpoint. 2025-12-04T13:48:17.0328265Z 2025-12-04T13:48:17.0328336Z 2) create_local_plan - called on all ranks. 2025-12-04T13:48:17.0328560Z Process the state_dict and produces a `LoadPlan` that will be sent for global planning. 2025-12-04T13:48:17.0328725Z 2025-12-04T13:48:17.0328815Z 3) create_global_plan - called on the coordinator rank only. 2025-12-04T13:48:17.0329029Z Takes the LoadPlan from all ranks and make any global decision. 2025-12-04T13:48:17.0329167Z 2025-12-04T13:48:17.0329236Z 4) load_bytes - called multiple times on each rank 2025-12-04T13:48:17.0329452Z This is called once per non-tensor value in state_dict. 2025-12-04T13:48:17.0329575Z 2025-12-04T13:48:17.0329677Z 5) resolve_tensor and commit_tensor - called multiple times on each rank 2025-12-04T13:48:17.0329903Z They are called in pair for each Tensor value in state_dict. 
2025-12-04T13:48:17.0330034Z 2025-12-04T13:48:17.0330160Z Users are recommended to extend DefaultLoadPlanner instead of this interface directly as 2025-12-04T13:48:17.0330421Z most changes can be expressed by changes in a single method. 2025-12-04T13:48:17.0330578Z 2025-12-04T13:48:17.0330648Z There are two usual patterns of extension: 2025-12-04T13:48:17.0330751Z 2025-12-04T13:48:17.0330880Z Rewriting state_dict. This is the simplest way to extend the load process as it 2025-12-04T13:48:17.0331142Z doesn't requite understanding the intrincacies of how LoadPlan works. We need 2025-12-04T13:48:17.0331357Z to keep a reference to the original state_dict as load happens in place so 2025-12-04T13:48:17.0331534Z we need to be able to perform it in place 2025-12-04T13:48:17.0331636Z 2025-12-04T13:48:17.0331688Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0331825Z >>> class RenamePlanner(DefaultLoadPlanner): 2025-12-04T13:48:17.0332308Z >>> def set_up_planner( 2025-12-04T13:48:17.0332416Z >>> self, 2025-12-04T13:48:17.0332521Z >>> state_dict: STATE_DICT_TYPE, 2025-12-04T13:48:17.0332644Z >>> metadata: Metadata, 2025-12-04T13:48:17.0332761Z >>> is_coordinator: bool, 2025-12-04T13:48:17.0332876Z >>> ) -> None: 2025-12-04T13:48:17.0332983Z >>> self.original_state_dict = state_dict 2025-12-04T13:48:17.0333140Z >>> state_dict = {"foo_" + k: v for k, v in state_dict.items()} 2025-12-04T13:48:17.0333278Z >>> 2025-12-04T13:48:17.0333375Z >>> if self.flatten_sharded_tensors: 2025-12-04T13:48:17.0333519Z >>> state_dict = _flatten_sharded_tensors(state_dict) 2025-12-04T13:48:17.0333649Z >>> 2025-12-04T13:48:17.0333742Z >>> if self.flatten_state_dict: 2025-12-04T13:48:17.0333897Z >>> state_dict, self.mappings = flatten_state_dict(state_dict) 2025-12-04T13:48:17.0334038Z >>> 2025-12-04T13:48:17.0334128Z >>> self.state_dict = state_dict 2025-12-04T13:48:17.0334254Z >>> self.metadata = metadata 2025-12-04T13:48:17.0334382Z >>> self.is_coordinator = is_coordinator 
2025-12-04T13:48:17.0334501Z >>> 2025-12-04T13:48:17.0334597Z >>> def load_bytes(self, read_item, value): 2025-12-04T13:48:17.0334731Z >>> # Remove the "foo_" prefix 2025-12-04T13:48:17.0334918Z >>> self.original_state_dict[read_item.dest_index.fqn[4:]] = torch.load(value, weights_only=False) 2025-12-04T13:48:17.0335072Z 2025-12-04T13:48:17.0335074Z 2025-12-04T13:48:17.0335168Z Modifying resolve_tensor and commit_tensor to handle load time transformation. 2025-12-04T13:48:17.0335301Z 2025-12-04T13:48:17.0335350Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:17.0335493Z >>> class MetaModelMaterialize(DefaultSavePlanner): 2025-12-04T13:48:17.0335643Z >>> def resolve_tensor(self, read_item): 2025-12-04T13:48:17.0335783Z >>> tensor = super().resolve_tensor(read_item) 2025-12-04T13:48:17.0335933Z >>> return torch.empty_like(tensor, device="cpu") 2025-12-04T13:48:17.0336063Z >>> 2025-12-04T13:48:17.0336162Z >>> def commit_tensor(self, read_item, tensor): 2025-12-04T13:48:17.0336312Z >>> self.state_dict[read_item.dest_index.fqn] = tensor 2025-12-04T13:48:17.0336411Z 2025-12-04T13:48:17.0336595Z Original Error: IndentationError('expected an indented block after function definition on line 22', ('', 23, 0, '_._ = None\n', 23, -1)) 2025-12-04T13:48:17.0336811Z 2025-12-04T13:48:17.0336848Z _._ = None 2025-12-04T13:48:17.0336936Z ^ 2025-12-04T13:48:17.3361581Z msg = Cannot scrape callname=FullStateDictConfig in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py line=295. 2025-12-04T13:48:17.3362696Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:17.3363178Z 2025-12-04T13:48:17.3363419Z ``FullStateDictConfig`` is a config class meant to be used with 2025-12-04T13:48:17.3363945Z ``StateDictType.FULL_STATE_DICT``. 
We recommend enabling both 2025-12-04T13:48:17.3364468Z ``offload_to_cpu=True`` and ``rank0_only=True`` when saving full state 2025-12-04T13:48:17.3365014Z dicts to save GPU memory and CPU memory, respectively. This config class 2025-12-04T13:48:17.3365543Z is meant to be used via the :func:`state_dict_type` context manager as 2025-12-04T13:48:17.3365991Z follows: 2025-12-04T13:48:17.3366137Z 2025-12-04T13:48:17.3366326Z >>> # xdoctest: +SKIP("undefined variables") 2025-12-04T13:48:17.3366809Z >>> from torch.distributed.fsdp import FullyShardedDataParallel as FSDP 2025-12-04T13:48:17.3367303Z >>> fsdp = FSDP(model, auto_wrap_policy=...) 2025-12-04T13:48:17.3367743Z >>> cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True) 2025-12-04T13:48:17.3368319Z >>> with FSDP.state_dict_type(fsdp, StateDictType.FULL_STATE_DICT, cfg): 2025-12-04T13:48:17.3368774Z >>> state = fsdp.state_dict() 2025-12-04T13:48:17.3369432Z >>> # `state` will be empty on non rank 0 and contain CPU tensors on rank 0. 2025-12-04T13:48:17.3369993Z >>> # To reload checkpoint for inference, finetuning, transfer learning, etc: 2025-12-04T13:48:17.3370583Z >>> model = model_fn() # Initialize model in preparation for wrapping with FSDP 2025-12-04T13:48:17.3371011Z >>> if dist.get_rank() == 0: 2025-12-04T13:48:17.3371314Z >>> # Load checkpoint only on rank 0 to avoid memory redundancy 2025-12-04T13:48:17.3371588Z >>> state_dict = torch.load("my_checkpoint.pt") 2025-12-04T13:48:17.3371824Z >>> model.load_state_dict(state_dict) 2025-12-04T13:48:17.3372142Z >>> # All ranks initialize FSDP module as usual. `sync_module_states` argument 2025-12-04T13:48:17.3372490Z >>> # communicates loaded checkpoint states from rank 0 to rest of the world. 2025-12-04T13:48:17.3372763Z >>> fsdp = FSDP( 2025-12-04T13:48:17.3372928Z ... model, 2025-12-04T13:48:17.3373110Z ... device_id=torch.cuda.current_device(), 2025-12-04T13:48:17.3373459Z ... auto_wrap_policy=..., 2025-12-04T13:48:17.3373737Z ... 
sync_module_states=True, 2025-12-04T13:48:17.3373981Z ... ) 2025-12-04T13:48:17.3374507Z >>> # After this point, all ranks have FSDP model with loaded checkpoint. 2025-12-04T13:48:17.3374718Z 2025-12-04T13:48:17.3374820Z Attributes: 2025-12-04T13:48:17.3375105Z rank0_only (bool): If ``True``, then only rank 0 saves the full state 2025-12-04T13:48:17.3375497Z dict, and nonzero ranks save an empty dict. If ``False``, then all 2025-12-04T13:48:17.3382676Z ranks save the full state dict. (Default: ``False``) 2025-12-04T13:48:17.3382803Z 2025-12-04T13:48:17.3383015Z Original Error: IndentationError("expected an indented block after 'if' statement on line 10", ('', 11, 1, '_._ = None\n', 11, 2)) 2025-12-04T13:48:17.3383273Z 2025-12-04T13:48:17.3383319Z _._ = None 2025-12-04T13:48:17.3383419Z ^ 2025-12-04T13:48:19.0135267Z msg = Cannot scrape callname=register_parametrization in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrize.py line=437. 2025-12-04T13:48:19.0136120Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:19.0136603Z Register a parametrization to a tensor in a module. 2025-12-04T13:48:19.0136844Z 2025-12-04T13:48:19.0137115Z Assume that ``tensor_name="weight"`` for simplicity. When accessing ``module.weight``, 2025-12-04T13:48:19.0137704Z the module will return the parametrized version ``parametrization(module.weight)``. 2025-12-04T13:48:19.0138321Z If the original tensor requires a gradient, the backward pass will differentiate 2025-12-04T13:48:19.0138908Z through :attr:`parametrization`, and the optimizer will update the tensor accordingly. 2025-12-04T13:48:19.0139252Z 2025-12-04T13:48:19.0139796Z The first time that a module registers a parametrization, this function will add an attribute 2025-12-04T13:48:19.0140374Z ``parametrizations`` to the module of type :class:`~ParametrizationList`. 
2025-12-04T13:48:19.0140687Z 2025-12-04T13:48:19.0140913Z The list of parametrizations on the tensor ``weight`` will be accessible under 2025-12-04T13:48:19.0141382Z ``module.parametrizations.weight``. 2025-12-04T13:48:19.0141581Z 2025-12-04T13:48:19.0141701Z The original tensor will be accessible under 2025-12-04T13:48:19.0142256Z ``module.parametrizations.weight.original``. 2025-12-04T13:48:19.0142469Z 2025-12-04T13:48:19.0142732Z Parametrizations may be concatenated by registering several parametrizations 2025-12-04T13:48:19.0143121Z on the same attribute. 2025-12-04T13:48:19.0143281Z 2025-12-04T13:48:19.0143474Z The training mode of a registered parametrization is updated on registration 2025-12-04T13:48:19.0143868Z to match the training mode of the host module 2025-12-04T13:48:19.0144072Z 2025-12-04T13:48:19.0144367Z Parametrized parameters and buffers have an inbuilt caching system that can be activated 2025-12-04T13:48:19.0144814Z using the context manager :func:`cached`. 2025-12-04T13:48:19.0145005Z 2025-12-04T13:48:19.0145198Z A :attr:`parametrization` may optionally implement a method with signature 2025-12-04T13:48:19.0145472Z 2025-12-04T13:48:19.0145607Z .. code-block:: python 2025-12-04T13:48:19.0145764Z 2025-12-04T13:48:19.0145946Z def right_inverse(self, X: Tensor) -> Union[Tensor, Sequence[Tensor]] 2025-12-04T13:48:19.0146211Z 2025-12-04T13:48:19.0146427Z This method is called on the unparametrized tensor when the first parametrization 2025-12-04T13:48:19.0146894Z is registered to compute the initial value of the original tensor. 2025-12-04T13:48:19.0147386Z If this method is not implemented, the original tensor will be just the unparametrized tensor. 2025-12-04T13:48:19.0147905Z 2025-12-04T13:48:19.0148158Z If all the parametrizations registered on a tensor implement `right_inverse` it is possible 2025-12-04T13:48:19.0148708Z to initialize a parametrized tensor by assigning to it, as shown in the example below. 
2025-12-04T13:48:19.0149017Z 2025-12-04T13:48:19.0149204Z It is possible for the first parametrization to depend on several inputs. 2025-12-04T13:48:19.0149668Z This may be implemented returning a tuple of tensors from ``right_inverse`` 2025-12-04T13:48:19.0150133Z (see the example implementation of a ``RankOne`` parametrization below). 2025-12-04T13:48:19.0150407Z 2025-12-04T13:48:19.0150674Z In this case, the unconstrained tensors are also located under ``module.parametrizations.weight`` 2025-12-04T13:48:19.0151048Z with names ``original0``, ``original1``,... 2025-12-04T13:48:19.0151180Z 2025-12-04T13:48:19.0151245Z .. note:: 2025-12-04T13:48:19.0151327Z 2025-12-04T13:48:19.0151487Z If unsafe=False (default) both the forward and right_inverse methods will be called 2025-12-04T13:48:19.0151793Z once to perform a number of consistency checks. 2025-12-04T13:48:19.0152142Z If unsafe=True, then right_inverse will be called if the tensor is not parametrized, 2025-12-04T13:48:19.0152423Z and nothing will be called otherwise. 2025-12-04T13:48:19.0152550Z 2025-12-04T13:48:19.0152614Z .. note:: 2025-12-04T13:48:19.0152694Z 2025-12-04T13:48:19.0152818Z In most situations, ``right_inverse`` will be a function such that 2025-12-04T13:48:19.0153070Z ``forward(right_inverse(X)) == X`` (see 2025-12-04T13:48:19.0153384Z `right inverse `_). 2025-12-04T13:48:19.0153748Z Sometimes, when the parametrization is not surjective, it may be reasonable 2025-12-04T13:48:19.0154011Z to relax this. 2025-12-04T13:48:19.0154108Z 2025-12-04T13:48:19.0154171Z .. warning:: 2025-12-04T13:48:19.0154256Z 2025-12-04T13:48:19.0154414Z If a parametrization depends on several inputs, :func:`~register_parametrization` 2025-12-04T13:48:19.0154815Z will register a number of new parameters. If such parametrization is registered 2025-12-04T13:48:19.0155171Z after the optimizer is created, these new parameters will need to be added manually 2025-12-04T13:48:19.0155497Z to the optimizer. 
See :meth:`torch.Optimizer.add_param_group`. 2025-12-04T13:48:19.0155664Z 2025-12-04T13:48:19.0155723Z Args: 2025-12-04T13:48:19.0155923Z module (nn.Module): module on which to register the parametrization 2025-12-04T13:48:19.0156250Z tensor_name (str): name of the parameter or buffer on which to register 2025-12-04T13:48:19.0156519Z the parametrization 2025-12-04T13:48:19.0156751Z parametrization (nn.Module): the parametrization to register 2025-12-04T13:48:19.0156978Z Keyword args: 2025-12-04T13:48:19.0157186Z unsafe (bool): a boolean flag that denotes whether the parametrization 2025-12-04T13:48:19.0157501Z may change the dtype and shape of the tensor. Default: `False` 2025-12-04T13:48:19.0157812Z Warning: the parametrization is not checked for consistency upon registration. 2025-12-04T13:48:19.0158089Z Enable this flag at your own risk. 2025-12-04T13:48:19.0158216Z 2025-12-04T13:48:19.0158269Z Raises: 2025-12-04T13:48:19.0158505Z ValueError: if the module does not have a parameter or a buffer named :attr:`tensor_name` 2025-12-04T13:48:19.0158719Z 2025-12-04T13:48:19.0158776Z Examples: 2025-12-04T13:48:19.0158951Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_LAPACK) 2025-12-04T13:48:19.0159155Z >>> import torch 2025-12-04T13:48:19.0159323Z >>> import torch.nn as nn 2025-12-04T13:48:19.0159525Z >>> import torch.nn.utils.parametrize as P 2025-12-04T13:48:19.0159714Z >>> 2025-12-04T13:48:19.0159858Z >>> class Symmetric(nn.Module): 2025-12-04T13:48:19.0160048Z >>> def forward(self, X): 2025-12-04T13:48:19.0160274Z >>> return X.triu() + X.triu(1).T # Return a symmetric matrix 2025-12-04T13:48:19.0160485Z >>> 2025-12-04T13:48:19.0160624Z >>> def right_inverse(self, A): 2025-12-04T13:48:19.0160772Z >>> return A.triu() 2025-12-04T13:48:19.0160901Z >>> 2025-12-04T13:48:19.0161008Z >>> m = nn.Linear(5, 5) 2025-12-04T13:48:19.0161171Z >>> P.register_parametrization(m, "weight", Symmetric()) 2025-12-04T13:48:19.0161395Z >>> print(torch.allclose(m.weight, m.weight.T)) # 
m.weight is now symmetric 2025-12-04T13:48:19.0161586Z True 2025-12-04T13:48:19.0161697Z >>> A = torch.rand(5, 5) 2025-12-04T13:48:19.0161838Z >>> A = A + A.T # A is now symmetric 2025-12-04T13:48:19.0162053Z >>> m.weight = A # Initialize the weight to be the symmetric matrix A 2025-12-04T13:48:19.0162240Z >>> print(torch.allclose(m.weight, A)) 2025-12-04T13:48:19.0162374Z True 2025-12-04T13:48:19.0162433Z 2025-12-04T13:48:19.0162485Z >>> class RankOne(nn.Module): 2025-12-04T13:48:19.0162629Z >>> def forward(self, x, y): 2025-12-04T13:48:19.0162788Z >>> # Form a rank 1 matrix multiplying two vectors 2025-12-04T13:48:19.0162960Z >>> return x.unsqueeze(-1) @ y.unsqueeze(-2) 2025-12-04T13:48:19.0163100Z >>> 2025-12-04T13:48:19.0163207Z >>> def right_inverse(self, Z): 2025-12-04T13:48:19.0163357Z >>> # Project Z onto the rank 1 matrices 2025-12-04T13:48:19.0163530Z >>> U, S, Vh = torch.linalg.svd(Z, full_matrices=False) 2025-12-04T13:48:19.0163695Z >>> # Return rescaled singular vectors 2025-12-04T13:48:19.0163848Z >>> s0_sqrt = S[0].sqrt().unsqueeze(-1) 2025-12-04T13:48:19.0164014Z >>> return U[..., :, 0] * s0_sqrt, Vh[..., 0, :] * s0_sqrt 2025-12-04T13:48:19.0164161Z >>> 2025-12-04T13:48:19.0164287Z >>> linear_rank_one = P.register_parametrization( 2025-12-04T13:48:19.0164510Z ... nn.Linear(4, 4), "weight", RankOne() 2025-12-04T13:48:19.0164650Z ... ) 2025-12-04T13:48:19.0164799Z >>> print(torch.linalg.matrix_rank(linear_rank_one.weight).item()) 2025-12-04T13:48:19.0164967Z 1 2025-12-04T13:48:19.0165025Z 2025-12-04T13:48:19.0165065Z 2025-12-04T13:48:19.0165330Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 3, 0, '_._ = None\n', 3, -1)) 2025-12-04T13:48:19.0165581Z 2025-12-04T13:48:19.0165640Z _._ = None 2025-12-04T13:48:19.0165736Z ^ 2025-12-04T13:48:19.0609865Z msg = Cannot scrape callname=ReduceLROnPlateau in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py line=1586. 
2025-12-04T13:48:19.0610850Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:19.0611422Z Reduce learning rate when a metric has stopped improving. 2025-12-04T13:48:19.0611725Z 2025-12-04T13:48:19.0612007Z Models often benefit from reducing the learning rate by a factor 2025-12-04T13:48:19.0612635Z of 2-10 once learning stagnates. This scheduler reads a metrics 2025-12-04T13:48:19.0613144Z quantity and if no improvement is seen for a 'patience' number 2025-12-04T13:48:19.0613589Z of epochs, the learning rate is reduced. 2025-12-04T13:48:19.0613831Z 2025-12-04T13:48:19.0613931Z Args: 2025-12-04T13:48:19.0614213Z optimizer (Optimizer): Wrapped optimizer. 2025-12-04T13:48:19.0614637Z mode (str): One of `min`, `max`. In `min` mode, lr will 2025-12-04T13:48:19.0615086Z be reduced when the quantity monitored has stopped 2025-12-04T13:48:19.0615540Z decreasing; in `max` mode it will be reduced when the 2025-12-04T13:48:19.0616028Z quantity monitored has stopped increasing. Default: 'min'. 2025-12-04T13:48:19.0616509Z factor (float): Factor by which the learning rate will be 2025-12-04T13:48:19.0616959Z reduced. new_lr = lr * factor. Default: 0.1. 2025-12-04T13:48:19.0617441Z patience (int): The number of allowed epochs with no improvement after 2025-12-04T13:48:19.0617927Z which the learning rate will be reduced. 2025-12-04T13:48:19.0618403Z For example, consider the case of having no patience (`patience = 0`). 2025-12-04T13:48:19.0619101Z In the first epoch, a baseline is established and is always considered good as there's no previous baseline. 2025-12-04T13:48:19.0619760Z In the second epoch, if the performance is worse than the baseline, 2025-12-04T13:48:19.0620239Z we have what is considered an intolerable epoch. 2025-12-04T13:48:19.0620765Z Since the count of intolerable epochs (1) is greater than the patience level (0), 2025-12-04T13:48:19.0621324Z the learning rate is reduced at the end of this epoch. 
2025-12-04T13:48:19.0621741Z From the third epoch onwards, the learning rate continues to be reduced at the end of each epoch 2025-12-04T13:48:19.0622088Z if the performance is worse than the baseline. If the performance improves or remains the same, 2025-12-04T13:48:19.0622293Z the learning rate is not adjusted. 2025-12-04T13:48:19.0622423Z Default: 10. 2025-12-04T13:48:19.0622569Z threshold (float): Threshold for measuring the new optimum, 2025-12-04T13:48:19.0622748Z to only focus on significant changes. Default: 1e-4. 2025-12-04T13:48:19.0622921Z threshold_mode (str): One of `rel`, `abs`. In `rel` mode, 2025-12-04T13:48:19.0623092Z dynamic_threshold = best * ( 1 + threshold ) in 'max' 2025-12-04T13:48:19.0623258Z mode or best * ( 1 - threshold ) in `min` mode. 2025-12-04T13:48:19.0623418Z In `abs` mode, dynamic_threshold = best + threshold in 2025-12-04T13:48:19.0623592Z `max` mode or best - threshold in `min` mode. Default: 'rel'. 2025-12-04T13:48:19.0623773Z cooldown (int): Number of epochs to wait before resuming 2025-12-04T13:48:19.0623976Z normal operation after lr has been reduced. Default: 0. 2025-12-04T13:48:19.0624151Z min_lr (float or list): A scalar or a list of scalars. A 2025-12-04T13:48:19.0624317Z lower bound on the learning rate of all param groups 2025-12-04T13:48:19.0624474Z or each group respectively. Default: 0. 2025-12-04T13:48:19.0624639Z eps (float): Minimal decay applied to lr. If the difference 2025-12-04T13:48:19.0624820Z between new and old lr is smaller than eps, the update is 2025-12-04T13:48:19.0624991Z ignored. Default: 1e-8. 2025-12-04T13:48:19.0625075Z 2025-12-04T13:48:19.0625113Z Example: 2025-12-04T13:48:19.0625234Z >>> # xdoctest: +SKIP 2025-12-04T13:48:19.0625400Z >>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9) 2025-12-04T13:48:19.0625590Z >>> scheduler = ReduceLROnPlateau(optimizer, "min") 2025-12-04T13:48:19.0625736Z >>> for epoch in range(10): 2025-12-04T13:48:19.0625859Z >>> train(...) 
2025-12-04T13:48:19.0625995Z >>> val_loss = validate(...) 2025-12-04T13:48:19.0626141Z >>> # Note that step should be called after validate() 2025-12-04T13:48:19.0626288Z >>> scheduler.step(val_loss) 2025-12-04T13:48:19.0626373Z 2025-12-04T13:48:19.0626454Z .. image:: ../scripts/lr_scheduler_images/ReduceLROnPlateau.png 2025-12-04T13:48:19.0626601Z 2025-12-04T13:48:19.0626803Z Original Error: IndentationError('unexpected indent', ('', 8, 4, ' scheduler.step(val_loss)\n', 8, -1)) 2025-12-04T13:48:19.0626994Z 2025-12-04T13:48:19.0627041Z scheduler.step(val_loss) 2025-12-04T13:48:19.0627150Z ^ 2025-12-04T13:48:22.0448186Z running 894 test(s) 2025-12-04T13:48:22.0454334Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::typename:0, line 1111 <- wrt source file 2025-12-04T13:48:22.0454868Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::typename:0 2025-12-04T13:48:22.0455278Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_tensor:0, line 1142 <- wrt source file 2025-12-04T13:48:22.0455667Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_tensor:0 2025-12-04T13:48:22.0456044Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_storage:0, line 1157 <- wrt source file 2025-12-04T13:48:22.0461250Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::is_storage:0 2025-12-04T13:48:22.0462051Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_device:0, line 1247 <- wrt source file 2025-12-04T13:48:22.0462468Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_device:0 2025-12-04T13:48:22.0462881Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_tensor_type:0, line 1296 <- wrt source file 2025-12-04T13:48:22.0463304Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_tensor_type:0 2025-12-04T13:48:22.0463708Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_dtype:0, line 1333 <- wrt source file 2025-12-04T13:48:22.0464114Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::set_default_dtype:0 2025-12-04T13:48:22.0464532Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::use_deterministic_algorithms:0, line 1497 <- wrt source file 2025-12-04T13:48:22.0464983Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::use_deterministic_algorithms:0 2025-12-04T13:48:22.0465790Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::compile:0, line 2655 <- wrt source file 2025-12-04T13:48:22.0466332Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::compile:0 2025-12-04T13:48:22.0466745Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::_is_device_backend_autoload_enabled:0, line 2963 <- wrt source file 2025-12-04T13:48:22.0470232Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/__init__.py::_is_device_backend_autoload_enabled:0 2025-12-04T13:48:22.0470890Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::Generator:0, line 15 <- wrt source file 2025-12-04T13:48:22.0471358Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::Generator:0 2025-12-04T13:48:22.0471930Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::_LinAlgError:0, line 5 <- wrt source file 2025-12-04T13:48:22.0472397Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so::_LinAlgError:0 2025-12-04T13:48:22.0472817Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::custom_op:0, line 55 <- wrt source file 2025-12-04T13:48:22.0473217Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::custom_op:0 2025-12-04T13:48:22.0473592Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl:0, line 138 <- wrt source file 2025-12-04T13:48:22.0473970Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl:0 2025-12-04T13:48:22.0474354Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl_abstract:0, line 208 <- wrt source file 2025-12-04T13:48:22.0731600Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_custom_ops.py::impl_abstract:0 2025-12-04T13:48:22.0732111Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_namedtensor_internals.py::update_names:0, line 118 <- wrt source file 2025-12-04T13:48:22.0733799Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_namedtensor_internals.py::update_names:0 2025-12-04T13:48:22.0734239Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_hook:0, line 681 <- wrt source file 2025-12-04T13:48:22.0747980Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_hook:0 2025-12-04T13:48:22.0748431Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_post_accumulate_grad_hook:0, line 738 <- wrt source file 2025-12-04T13:48:22.0762563Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.register_post_accumulate_grad_hook:0 2025-12-04T13:48:22.0802634Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.refine_names:0, line 1374 <- wrt source file 2025-12-04T13:48:22.0803055Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.refine_names:0 2025-12-04T13:48:22.0803634Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.align_to:0, line 1419 <- wrt source file 2025-12-04T13:48:22.0805829Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.align_to:0 2025-12-04T13:48:22.0806223Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.rename:0, line 1492 <- wrt source file 2025-12-04T13:48:22.0808953Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.rename:0 2025-12-04T13:48:22.0809354Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.to_sparse_coo:0, line 1522 <- wrt source file 2025-12-04T13:48:22.0932333Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.to_sparse_coo:0 2025-12-04T13:48:22.0938356Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.dim_order:0, line 1554 <- wrt source file 2025-12-04T13:48:22.0938821Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py::Tensor.dim_order:0 2025-12-04T13:48:22.0939229Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor_str.py::set_printoptions:0, line 53 <- wrt source file 2025-12-04T13:48:22.0939698Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor_str.py::set_printoptions:0 2025-12-04T13:48:22.0940116Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_tensors:0, line 64 <- wrt source file 2025-12-04T13:48:22.0940536Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_tensors:0 2025-12-04T13:48:22.0940944Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_shapes:0, line 92 <- 
wrt source file 2025-12-04T13:48:22.0941372Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::broadcast_shapes:0 2025-12-04T13:48:22.0941763Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::split:0, line 144 <- wrt source file 2025-12-04T13:48:22.0942227Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::split:0 2025-12-04T13:48:22.0942604Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::einsum:0, line 258 <- wrt source file 2025-12-04T13:48:22.0942999Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::einsum:0 2025-12-04T13:48:22.0943379Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::meshgrid:0, line 450 <- wrt source file 2025-12-04T13:48:22.0960086Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::meshgrid:0 2025-12-04T13:48:22.0960484Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_impl:0, line 835 <- wrt source file 2025-12-04T13:48:22.0979815Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_impl:0 2025-12-04T13:48:22.0980248Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_consecutive_impl:0, line 992 <- wrt source file 2025-12-04T13:48:22.0984898Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_unique_consecutive_impl:0 2025-12-04T13:48:22.0985324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::tensordot:0, line 1267 <- wrt source file 2025-12-04T13:48:22.0990392Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::tensordot:0 2025-12-04T13:48:22.0990796Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cartesian_prod:0, line 
1351 <- wrt source file 2025-12-04T13:48:22.0994218Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cartesian_prod:0 2025-12-04T13:48:22.0994669Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::block_diag:0, line 1385 <- wrt source file 2025-12-04T13:48:22.0999058Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::block_diag:0 2025-12-04T13:48:22.0999442Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cdist:0, line 1441 <- wrt source file 2025-12-04T13:48:22.1006181Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::cdist:0 2025-12-04T13:48:22.1006620Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_1d:0, line 1482 <- wrt source file 2025-12-04T13:48:22.1013061Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_1d:0 2025-12-04T13:48:22.1013485Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_2d:0, line 1520 <- wrt source file 2025-12-04T13:48:22.1020322Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_2d:0 2025-12-04T13:48:22.1020720Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_3d:0, line 1560 <- wrt source file 2025-12-04T13:48:22.1030078Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::atleast_3d:0 2025-12-04T13:48:22.1030466Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::norm:0, line 1735 <- wrt source file 2025-12-04T13:48:22.1043264Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::norm:0 2025-12-04T13:48:22.1043659Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::unravel_index:0, line 1905 <- wrt source 
file 2025-12-04T13:48:22.1055610Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::unravel_index:0 2025-12-04T13:48:22.1056017Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::chain_matmul:0, line 2005 <- wrt source file 2025-12-04T13:48:22.1056424Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::chain_matmul:0 2025-12-04T13:48:22.1056812Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_lu_impl:0, line 2106 <- wrt source file 2025-12-04T13:48:22.1057236Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/functional.py::_lu_impl:0 2025-12-04T13:48:22.1057614Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::list:0, line 477 <- wrt source file 2025-12-04T13:48:22.1057974Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::list:0 2025-12-04T13:48:22.1058325Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::help:0, line 537 <- wrt source file 2025-12-04T13:48:22.1058679Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::help:0 2025-12-04T13:48:22.1059028Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load:0, line 628 <- wrt source file 2025-12-04T13:48:22.1059388Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load:0 2025-12-04T13:48:22.1059748Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::_load_local:0, line 676 <- wrt source file 2025-12-04T13:48:22.1060122Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::_load_local:0 2025-12-04T13:48:22.1060508Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::download_url_to_file:0, line 711 <- wrt source file 2025-12-04T13:48:22.1060939Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::download_url_to_file:0 2025-12-04T13:48:22.1061334Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load_state_dict_from_url:0, line 852 <- wrt source file 2025-12-04T13:48:22.1061744Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/hub.py::load_state_dict_from_url:0 2025-12-04T13:48:22.1062223Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.define:0, line 145 <- wrt source file 2025-12-04T13:48:22.1062630Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.define:0 2025-12-04T13:48:22.1063051Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library._impl_with_aoti_compile:0, line 239 <- wrt source file 2025-12-04T13:48:22.1067164Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library._impl_with_aoti_compile:0 2025-12-04T13:48:22.1067585Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.impl:0, line 300 <- wrt source file 2025-12-04T13:48:22.1069203Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::Library.impl:0 2025-12-04T13:48:22.1069587Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::define:0, line 521 <- wrt source file 2025-12-04T13:48:22.1076152Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::define:0 2025-12-04T13:48:22.1076524Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::impl:0, line 627 <- wrt source file 2025-12-04T13:48:22.1084142Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::impl:0 2025-12-04T13:48:22.1084531Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_kernel:0, line 809 <- wrt source file 2025-12-04T13:48:22.1084930Z 
* SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_kernel:0 2025-12-04T13:48:22.1085327Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autocast:0, line 877 <- wrt source file 2025-12-04T13:48:22.1085745Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autocast:0 2025-12-04T13:48:22.1086147Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autograd:0, line 1164 <- wrt source file 2025-12-04T13:48:22.1166938Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_autograd:0 2025-12-04T13:48:22.1167356Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_torch_dispatch:0, line 1280 <- wrt source file 2025-12-04T13:48:22.1204223Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_torch_dispatch:0 2025-12-04T13:48:22.1204636Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_vmap:0, line 1369 <- wrt source file 2025-12-04T13:48:22.1279390Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::register_vmap:0 2025-12-04T13:48:22.1279788Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::opcheck:0, line 1694 <- wrt source file 2025-12-04T13:48:22.1280172Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py::opcheck:0 2025-12-04T13:48:22.1280621Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_ignored_functions:0, line 117 <- wrt source file 2025-12-04T13:48:22.1284112Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_ignored_functions:0 2025-12-04T13:48:22.1284541Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_testing_overrides:0, line 435 
<- wrt source file 2025-12-04T13:48:22.1304801Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::get_testing_overrides:0 2025-12-04T13:48:22.1305271Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::wrap_torch_function:0, line 1589 <- wrt source file 2025-12-04T13:48:22.1306569Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::wrap_torch_function:0 2025-12-04T13:48:22.1307015Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::handle_torch_function:0, line 1725 <- wrt source file 2025-12-04T13:48:22.1308046Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::handle_torch_function:0 2025-12-04T13:48:22.1308492Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_method_or_property:0, line 1974 <- wrt source file 2025-12-04T13:48:22.1324841Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_method_or_property:0 2025-12-04T13:48:22.1325269Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_like:0, line 1993 <- wrt source file 2025-12-04T13:48:22.1328635Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/overrides.py::is_tensor_like:0 2025-12-04T13:48:22.1329051Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/quasirandom.py::SobolEngine:0, line 39 <- wrt source file 2025-12-04T13:48:22.1329462Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/quasirandom.py::SobolEngine:0 2025-12-04T13:48:22.1329875Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::add_safe_globals:0, line 300 <- wrt source file 2025-12-04T13:48:22.1330311Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::add_safe_globals:0 2025-12-04T13:48:22.1330733Z * DOCTEST 
: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::safe_globals:0, line 325 <- wrt source file 2025-12-04T13:48:22.1331148Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::safe_globals:0 2025-12-04T13:48:22.1331546Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::skip_data:0, line 401 <- wrt source file 2025-12-04T13:48:22.1332003Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::skip_data:0 2025-12-04T13:48:22.1332413Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::register_package:0, line 473 <- wrt source file 2025-12-04T13:48:22.1333602Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::register_package:0 2025-12-04T13:48:22.1334012Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::save:0, line 960 <- wrt source file 2025-12-04T13:48:22.1334405Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::save:0 2025-12-04T13:48:22.1334791Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::load:0, line 1379 <- wrt source file 2025-12-04T13:48:22.1336609Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/serialization.py::load:0 2025-12-04T13:48:22.1337012Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/torch_version.py::TorchVersion:0, line 19 <- wrt source file 2025-12-04T13:48:22.1337433Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/torch_version.py::TorchVersion:0 2025-12-04T13:48:22.1337857Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_mode_options:0, line 349 <- wrt source file 2025-12-04T13:48:22.1338378Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_mode_options:0 2025-12-04T13:48:22.1338802Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_options:0, line 388 <- wrt source file 2025-12-04T13:48:22.1346019Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/__init__.py::list_options:0 2025-12-04T13:48:22.1346485Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims_common/__init__.py::compute_required_storage_length:0, line 1911 <- wrt source file 2025-12-04T13:48:22.1348564Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims_common/__init__.py::compute_required_storage_length:0 2025-12-04T13:48:22.1349044Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::current_accelerator:0, line 117 <- wrt source file 2025-12-04T13:48:22.3193160Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::current_accelerator:0 2025-12-04T13:48:22.3193644Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::get_device_capability:0, line 171 <- wrt source file 2025-12-04T13:48:22.3195056Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::get_device_capability:0 2025-12-04T13:48:22.3195518Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::device_index:0, line 276 <- wrt source file 2025-12-04T13:48:22.3195959Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py::device_index:0 2025-12-04T13:48:22.3196390Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::allow_in_graph:0, line 130 <- wrt source file 2025-12-04T13:48:22.3196815Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::allow_in_graph:0 2025-12-04T13:48:22.3197239Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::substitute_in_graph:0, line 186 <- wrt source file 2025-12-04T13:48:22.6835032Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::substitute_in_graph:0 2025-12-04T13:48:22.6835556Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::wrap_numpy:0, line 416 <- wrt source file 2025-12-04T13:48:22.6835990Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::wrap_numpy:0 2025-12-04T13:48:22.6836405Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_compiling:0, line 448 <- wrt source file 2025-12-04T13:48:22.6836922Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_compiling:0 2025-12-04T13:48:22.6837364Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_dynamo_compiling:0, line 469 <- wrt source file 2025-12-04T13:48:22.6838384Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_dynamo_compiling:0 2025-12-04T13:48:22.6838816Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_exporting:0, line 487 <- wrt source file 2025-12-04T13:48:22.6839774Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::is_exporting:0 2025-12-04T13:48:22.6840296Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::save_cache_artifacts:0, line 502 <- wrt source file 2025-12-04T13:48:22.6840823Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::save_cache_artifacts:0 2025-12-04T13:48:22.6841263Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::load_cache_artifacts:0, line 522 <- wrt source file 2025-12-04T13:48:22.6841759Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/compiler/__init__.py::load_cache_artifacts:0 2025-12-04T13:48:22.6842335Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py::_compile_kernel:0, line 1788 <- wrt source file 2025-12-04T13:48:22.6842775Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py::_compile_kernel:0 2025-12-04T13:48:22.6843178Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::save:0, line 349 <- wrt source file 2025-12-04T13:48:22.6843577Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::save:0 2025-12-04T13:48:22.6843959Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::load:0, line 422 <- wrt source file 2025-12-04T13:48:22.6844358Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::load:0 2025-12-04T13:48:22.6844766Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::register_dataclass:0, line 581 <- wrt source file 2025-12-04T13:48:22.6845204Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/__init__.py::register_dataclass:0 2025-12-04T13:48:22.6845619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.then:0, line 152 <- wrt source file 2025-12-04T13:48:22.6846052Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.then:0 2025-12-04T13:48:22.6846481Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.add_done_callback:0, line 201 <- wrt source file 2025-12-04T13:48:22.6846947Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.add_done_callback:0 2025-12-04T13:48:22.6847392Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_result:0, line 235 <- wrt source file 2025-12-04T13:48:22.6847826Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_result:0 2025-12-04T13:48:22.6848256Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_exception:0, line 265 <- wrt source file 2025-12-04T13:48:22.6848702Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::Future.set_exception:0 2025-12-04T13:48:22.6849119Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::collect_all:0, line 299 <- wrt source file 2025-12-04T13:48:22.6849571Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/futures/__init__.py::collect_all:0 2025-12-04T13:48:22.6849964Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/__init__.py::annotate:0, line 147 <- wrt source file 2025-12-04T13:48:22.6850354Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/__init__.py::annotate:0 2025-12-04T13:48:22.6850775Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/monitor/__init__.py::TensorboardEventHandler:0, line 22 <- wrt source file 2025-12-04T13:48:22.6855326Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/monitor/__init__.py::TensorboardEventHandler:0 2025-12-04T13:48:22.6855763Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/mps/__init__.py::compile_shader:0, line 148 <- wrt source file 2025-12-04T13:48:22.6856194Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/mps/__init__.py::compile_shader:0 2025-12-04T13:48:22.6856601Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::as_nested_tensor:0, line 61 <- wrt source file 2025-12-04T13:48:22.6984615Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::as_nested_tensor:0 2025-12-04T13:48:22.7015149Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor:0, line 240 <- wrt source file 2025-12-04T13:48:22.7092703Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor:0 2025-12-04T13:48:22.7093169Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::narrow:0, line 315 <- wrt source file 2025-12-04T13:48:22.7093587Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::narrow:0 2025-12-04T13:48:22.7094032Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor_from_jagged:0, line 405 <- wrt source file 2025-12-04T13:48:22.7094634Z W1204 13:48:22.708000 9232 site-packages/torch/fx/_symbolic_trace.py:53] is_fx_tracing will return true for both fx.symbolic_trace and torch.export. Please use is_fx_tracing_symbolic_tracing() for specifically fx.symbolic_trace or torch.compiler.is_compiling() for specifically torch.export/compile. 
2025-12-04T13:48:22.7101679Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::nested_tensor_from_jagged:0 2025-12-04T13:48:22.7102166Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::masked_select:0, line 481 <- wrt source file 2025-12-04T13:48:22.7110689Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py::masked_select:0 2025-12-04T13:48:22.7111096Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::sum:0, line 223 <- wrt source file 2025-12-04T13:48:22.7183365Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::sum:0 2025-12-04T13:48:22.7193068Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::check_sparse_tensor_invariants:0, line 475 <- wrt source file 2025-12-04T13:48:22.7193627Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::check_sparse_tensor_invariants:0 2025-12-04T13:48:22.7194096Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::as_sparse_gradcheck:0, line 561 <- wrt source file 2025-12-04T13:48:22.8223587Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/__init__.py::as_sparse_gradcheck:0 2025-12-04T13:48:22.8224211Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/decorators.py::substitute_in_graph:0, line 361 <- wrt source file 2025-12-04T13:48:22.8224673Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/decorators.py::substitute_in_graph:0 2025-12-04T13:48:22.8225145Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/base.py::VariableTracker.python_type:0, line 328 <- wrt source file 2025-12-04T13:48:22.8225745Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/base.py::VariableTracker.python_type:0 2025-12-04T13:48:22.8226301Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/higher_order_ops.py::speculate_subgraph_with_auto_output_flattening:0, line 1316 <- wrt source file 2025-12-04T13:48:22.8227004Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/variables/higher_order_ops.py::speculate_subgraph_with_auto_output_flattening:0 2025-12-04T13:48:22.8227531Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/utils.py::register_module_as_pytree_input_node:0, line 1441 <- wrt source file 2025-12-04T13:48:22.8228017Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/utils.py::register_module_as_pytree_input_node:0 2025-12-04T13:48:22.8228538Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::mark_subclass_constructor_exportable_experimental:0, line 194 <- wrt source file 2025-12-04T13:48:22.8229077Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::mark_subclass_constructor_exportable_experimental:0 2025-12-04T13:48:22.8229565Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::allow_in_pre_dispatch_graph:0, line 262 <- wrt source file 2025-12-04T13:48:22.8230034Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_export/wrappers.py::allow_in_pre_dispatch_graph:0 2025-12-04T13:48:22.8230483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py::aot_function:0, line 771 <- wrt source file 2025-12-04T13:48:22.8389652Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py::aot_function:0 2025-12-04T13:48:22.8390257Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/benchmark_utils.py::benchmark_utilization:0, line 184 <- wrt source file 2025-12-04T13:48:22.8390772Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/benchmark_utils.py::benchmark_utilization:0 2025-12-04T13:48:22.8391241Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::vjp:0, line 234 <- wrt source file 2025-12-04T13:48:22.8410502Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::vjp:0 2025-12-04T13:48:22.8410936Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacrev:0, line 476 <- wrt source file 2025-12-04T13:48:22.8441649Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacrev:0 2025-12-04T13:48:22.8442130Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jvp:0, line 1024 <- wrt source file 2025-12-04T13:48:22.8880835Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jvp:0 2025-12-04T13:48:22.8881842Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacfwd:0, line 1182 <- wrt source file 2025-12-04T13:48:22.8907694Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::jacfwd:0 2025-12-04T13:48:22.8908228Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::hessian:0, line 1342 <- wrt source file 2025-12-04T13:48:22.8916528Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::hessian:0 2025-12-04T13:48:22.8917051Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::functionalize:0, line 1506 
<- wrt source file 2025-12-04T13:48:22.8918832Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::functionalize:0 2025-12-04T13:48:22.8919347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::linearize:0, line 1705 <- wrt source file 2025-12-04T13:48:22.9015020Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/eager_transforms.py::linearize:0 2025-12-04T13:48:22.9015477Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/functional_call.py::functional_call:0, line 36 <- wrt source file 2025-12-04T13:48:22.9017634Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/functional_call.py::functional_call:0 2025-12-04T13:48:22.9018093Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/fx_minifier.py::minifier:0, line 194 <- wrt source file 2025-12-04T13:48:22.9018538Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/fx_minifier.py::minifier:0 2025-12-04T13:48:22.9019047Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::CompilerWrapper.post_compile:0, line 1111 <- wrt source file 2025-12-04T13:48:22.9019576Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::CompilerWrapper.post_compile:0 2025-12-04T13:48:22.9020092Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::InductorWrapper.post_compile:0, line 1166 <- wrt source file 2025-12-04T13:48:22.9020618Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/schemas.py::InductorWrapper.post_compile:0 2025-12-04T13:48:22.9021111Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::associative_scan:0, line 183 <- wrt source file 2025-12-04T13:48:22.9021617Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::associative_scan:0 2025-12-04T13:48:22.9022155Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::generic_associative_scan:0, line 319 <- wrt source file 2025-12-04T13:48:22.9022669Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/associative_scan.py::generic_associative_scan:0 2025-12-04T13:48:22.9023120Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/cond.py::cond:0, line 139 <- wrt source file 2025-12-04T13:48:22.9023535Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/cond.py::cond:0 2025-12-04T13:48:22.9023977Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flat_apply.py::FlatApply.__call__:0, line 80 <- wrt source file 2025-12-04T13:48:22.9024523Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/flat_apply.py::FlatApply.__call__:0 2025-12-04T13:48:22.9024949Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/map.py::map:0, line 80 <- wrt source file 2025-12-04T13:48:22.9025367Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/map.py::map:0 2025-12-04T13:48:22.9025892Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/partitioner.py::HopPartitionedGraph._reorder_fw_output:0, line 133 <- wrt source file 2025-12-04T13:48:22.9026444Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/partitioner.py::HopPartitionedGraph._reorder_fw_output:0 2025-12-04T13:48:22.9026960Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/scan.py::scan:0, line 130 <- wrt source file 2025-12-04T13:48:22.9027377Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_higher_order_ops/scan.py::scan:0 2025-12-04T13:48:22.9027804Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/codecache.py::WritableTempFile:0, line 385 <- wrt source file 2025-12-04T13:48:22.9028254Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/codecache.py::WritableTempFile:0 2025-12-04T13:48:22.9028727Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/cpp_builder.py::get_name_and_dir_from_output_file_path:0, line 1845 <- wrt source file 2025-12-04T13:48:22.9029235Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/cpp_builder.py::get_name_and_dir_from_output_file_path:0 2025-12-04T13:48:22.9029727Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py::add_preprocessing_fn:0, line 4328 <- wrt source file 2025-12-04T13:48:22.9030210Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/select_algorithm.py::add_preprocessing_fn:0 2025-12-04T13:48:22.9030677Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/fx_passes/graph_view.py::_clean_stack_name:0, line 100 <- wrt source file 2025-12-04T13:48:22.9031151Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/fx_passes/graph_view.py::_clean_stack_name:0 2025-12-04T13:48:22.9031621Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/kernel/custom_op.py::CustomOpConfig:0, line 56 <- wrt source file 2025-12-04T13:48:22.9032136Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/kernel/custom_op.py::CustomOpConfig:0 2025-12-04T13:48:22.9032611Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/kernel/custom_op.py::register_custom_op_autotuning:0, line 423 <- wrt source file 2025-12-04T13:48:22.9033115Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/kernel/custom_op.py::register_custom_op_autotuning:0 2025-12-04T13:48:22.9033612Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_lock_with_timeout:0, line 69 <- wrt source file 2025-12-04T13:48:22.9034127Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_lock_with_timeout:0 2025-12-04T13:48:22.9034635Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_lock_with_timeout:0, line 105 <- wrt source file 2025-12-04T13:48:22.9035190Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_lock_with_timeout:0 2025-12-04T13:48:22.9035701Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_flock_with_timeout:0, line 142 <- wrt source file 2025-12-04T13:48:22.9036216Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_acquire_flock_with_timeout:0 2025-12-04T13:48:22.9036764Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_flock_with_timeout:0, line 179 <- wrt source file 2025-12-04T13:48:22.9037299Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/runtime/caching/locks.py::_unsafe_acquire_flock_with_timeout:0 2025-12-04T13:48:22.9037853Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/template_heuristics/registry.py::register_template_heuristic:0, line 54 <- wrt source file 
2025-12-04T13:48:22.9038390Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_inductor/template_heuristics/registry.py::register_template_heuristic:0 2025-12-04T13:48:22.9038848Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::custom_op:0, line 101 <- wrt source file 2025-12-04T13:48:22.9222663Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::custom_op:0 2025-12-04T13:48:22.9223124Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.set_kernel_enabled:0, line 241 <- wrt source file 2025-12-04T13:48:22.9263918Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.set_kernel_enabled:0 2025-12-04T13:48:22.9264404Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_kernel:0, line 310 <- wrt source file 2025-12-04T13:48:22.9264885Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_kernel:0 2025-12-04T13:48:22.9265359Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autograd:0, line 549 <- wrt source file 2025-12-04T13:48:22.9343488Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autograd:0 2025-12-04T13:48:22.9343964Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_vmap:0, line 724 <- wrt source file 2025-12-04T13:48:22.9421109Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_vmap:0 2025-12-04T13:48:22.9421586Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autocast:0, line 810 <- wrt 
source file 2025-12-04T13:48:22.9422118Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py::CustomOpDef.register_autocast:0 2025-12-04T13:48:22.9422598Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_class_registry.py::register_fake_class:0, line 273 <- wrt source file 2025-12-04T13:48:22.9423073Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_class_registry.py::register_fake_class:0 2025-12-04T13:48:22.9423543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_impl.py::FakeImplCtx.new_dynamic_size:0, line 175 <- wrt source file 2025-12-04T13:48:22.9457124Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_impl.py::FakeImplCtx.new_dynamic_size:0 2025-12-04T13:48:22.9457578Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/infer_schema.py::infer_schema:0, line 53 <- wrt source file 2025-12-04T13:48:22.9460622Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/infer_schema.py::infer_schema:0 2025-12-04T13:48:22.9461092Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::triton_op:0, line 136 <- wrt source file 2025-12-04T13:48:22.9461499Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::triton_op:0 2025-12-04T13:48:22.9461954Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::wrap_triton:0, line 307 <- wrt source file 2025-12-04T13:48:22.9462367Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/triton.py::wrap_triton:0 2025-12-04T13:48:22.9462767Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_logging/_internal.py::set_logs:0, line 460 <- wrt source file 2025-12-04T13:48:22.9463183Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_logging/_internal.py::set_logs:0 2025-12-04T13:48:22.9463601Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_equal:0, line 171 <- wrt source file 2025-12-04T13:48:22.9483250Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_equal:0 2025-12-04T13:48:22.9483718Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::print_assert_equal:0, line 302 <- wrt source file 2025-12-04T13:48:22.9484179Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::print_assert_equal:0 2025-12-04T13:48:22.9484623Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_almost_equal:0, line 375 <- wrt source file 2025-12-04T13:48:22.9503248Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_almost_equal:0 2025-12-04T13:48:22.9503703Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_approx_equal:0, line 496 <- wrt source file 2025-12-04T13:48:22.9504926Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_approx_equal:0 2025-12-04T13:48:22.9505381Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_equal:0, line 793 <- wrt source file 2025-12-04T13:48:22.9531513Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_equal:0 2025-12-04T13:48:22.9532053Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal:0, line 899 <- wrt source file 2025-12-04T13:48:22.9555765Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal:0 2025-12-04T13:48:22.9556234Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_less:0, line 1008 <- wrt source file 2025-12-04T13:48:22.9577625Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_less:0 2025-12-04T13:48:22.9578075Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_string_equal:0, line 1073 <- wrt source file 2025-12-04T13:48:22.9578563Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_string_equal:0 2025-12-04T13:48:22.9579003Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_allclose:0, line 1294 <- wrt source file 2025-12-04T13:48:22.9585752Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_allclose:0 2025-12-04T13:48:22.9586275Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal_nulp:0, line 1360 <- wrt source file 2025-12-04T13:48:22.9587865Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_almost_equal_nulp:0 2025-12-04T13:48:22.9588360Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_max_ulp:0, line 1423 <- wrt source file 2025-12-04T13:48:22.9589901Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_array_max_ulp:0 2025-12-04T13:48:22.9590346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::nulp_diff:0, line 1468 <- wrt source file 2025-12-04T13:48:22.9590783Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::nulp_diff:0 2025-12-04T13:48:22.9591212Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_warns:0, line 1578 <- wrt source file 2025-12-04T13:48:22.9593751Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::assert_warns:0 2025-12-04T13:48:22.9594206Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::clear_and_catch_warnings:0, line 1881 <- wrt source file 2025-12-04T13:48:22.9594685Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_numpy/testing/utils.py::clear_and_catch_warnings:0 2025-12-04T13:48:22.9595122Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims/context.py::TorchRefsMode:0, line 95 <- wrt source file 2025-12-04T13:48:22.9595556Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_prims/context.py::TorchRefsMode:0 2025-12-04T13:48:22.9596015Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/complex_tensor/_ops/common.py::is_complex_tensor:0, line 47 <- wrt source file 2025-12-04T13:48:22.9596520Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_subclasses/complex_tensor/_ops/common.py::is_complex_tensor:0 2025-12-04T13:48:22.9596969Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/amp/grad_scaler.py::GradScaler:0, line 64 <- wrt source file 2025-12-04T13:48:22.9597397Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/amp/grad_scaler.py::GradScaler:0 2025-12-04T13:48:22.9597862Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/qat/modules/linear_relu.py::LinearReLU:0, line 34 <- wrt source file 2025-12-04T13:48:22.9598365Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/qat/modules/linear_relu.py::LinearReLU:0 2025-12-04T13:48:22.9598888Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py::LinearReLU:0, line 24 <- wrt source file 2025-12-04T13:48:22.9599482Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/dynamic/modules/linear_relu.py::LinearReLU:0 2025-12-04T13:48:22.9600011Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearReLU:0, line 25 <- wrt source file 2025-12-04T13:48:22.9600543Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearReLU:0 2025-12-04T13:48:22.9601105Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearLeakyReLU:0, line 67 <- wrt source file 2025-12-04T13:48:22.9601656Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearLeakyReLU:0 2025-12-04T13:48:22.9602286Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearTanh:0, line 142 <- wrt source file 2025-12-04T13:48:22.9602826Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/intrinsic/quantized/modules/linear_relu.py::LinearTanh:0 2025-12-04T13:48:22.9603305Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTMCell:0, line 29 <- wrt source file 2025-12-04T13:48:23.0074587Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTMCell:0 2025-12-04T13:48:23.0075072Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTM:0, line 413 <- wrt source file 2025-12-04T13:48:23.1580058Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantizable/modules/rnn.py::LSTM:0 2025-12-04T13:48:23.1580563Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv1d:0, line 210 <- wrt source file 2025-12-04T13:48:23.1581043Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv1d:0 2025-12-04T13:48:23.1581481Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv2d:0, line 282 <- wrt source file 2025-12-04T13:48:23.1582117Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv2d:0 2025-12-04T13:48:23.1582563Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv3d:0, line 358 <- wrt source file 2025-12-04T13:48:23.1583003Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/functional.py::conv3d:0 2025-12-04T13:48:23.1583455Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::Quantize:0, line 95 <- wrt source file 2025-12-04T13:48:23.1583935Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::Quantize:0 2025-12-04T13:48:23.1584392Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::DeQuantize:0, line 145 <- wrt source file 2025-12-04T13:48:23.1585777Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/__init__.py::DeQuantize:0 2025-12-04T13:48:23.1586258Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv1d:0, line 
43 <- wrt source file 2025-12-04T13:48:23.1586743Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv1d:0 2025-12-04T13:48:23.1587482Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv2d:0, line 126 <- wrt source file 2025-12-04T13:48:23.1587974Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv2d:0 2025-12-04T13:48:23.1588442Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv3d:0, line 212 <- wrt source file 2025-12-04T13:48:23.1589007Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::Conv3d:0 2025-12-04T13:48:23.1589506Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose1d:0, line 300 <- wrt source file 2025-12-04T13:48:23.1590028Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose1d:0 2025-12-04T13:48:23.1590583Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose2d:0, line 383 <- wrt source file 2025-12-04T13:48:23.1591105Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose2d:0 2025-12-04T13:48:23.1591605Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose3d:0, line 466 <- wrt source file 2025-12-04T13:48:23.1592159Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/conv.py::ConvTranspose3d:0 2025-12-04T13:48:23.1592651Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/linear.py::Linear:0, line 30 <- wrt source file 2025-12-04T13:48:23.1593148Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/linear.py::Linear:0 2025-12-04T13:48:23.1593618Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTM:0, line 516 <- wrt source file 2025-12-04T13:48:23.1594086Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTM:0 2025-12-04T13:48:23.1594541Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRU:0, line 803 <- wrt source file 2025-12-04T13:48:23.1595006Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRU:0 2025-12-04T13:48:23.1595469Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::RNNCell:0, line 1209 <- wrt source file 2025-12-04T13:48:23.1595956Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::RNNCell:0 2025-12-04T13:48:23.1596426Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTMCell:0, line 1276 <- wrt source file 2025-12-04T13:48:23.1596906Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::LSTMCell:0 2025-12-04T13:48:23.1597378Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRUCell:0, line 1329 <- wrt source file 2025-12-04T13:48:23.1597852Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/dynamic/modules/rnn.py::GRUCell:0 2025-12-04T13:48:23.1598314Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/activation.py::ReLU6:0, line 36 <- wrt source file 2025-12-04T13:48:23.1598823Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/activation.py::ReLU6:0 2025-12-04T13:48:23.1599269Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv1d:0, line 376 <- wrt source file 2025-12-04T13:48:23.1599715Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv1d:0 2025-12-04T13:48:23.1600182Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv2d:0, line 506 <- wrt source file 2025-12-04T13:48:23.1600626Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv2d:0 2025-12-04T13:48:23.1601095Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv3d:0, line 636 <- wrt source file 2025-12-04T13:48:23.1601543Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::Conv3d:0 2025-12-04T13:48:23.1602045Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose1d:0, line 893 <- wrt source file 2025-12-04T13:48:23.1602522Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose1d:0 2025-12-04T13:48:23.1602994Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose2d:0, line 1015 <- wrt source file 2025-12-04T13:48:23.1603473Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose2d:0 2025-12-04T13:48:23.1603943Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose3d:0, line 1141 <- wrt source file 2025-12-04T13:48:23.1604421Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/conv.py::ConvTranspose3d:0 2025-12-04T13:48:23.1604896Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::Embedding:0, line 111 <- wrt source file 2025-12-04T13:48:23.1814600Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::Embedding:0 2025-12-04T13:48:23.1815099Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::EmbeddingBag:0, line 275 <- wrt source file 2025-12-04T13:48:23.2034004Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/embedding_ops.py::EmbeddingBag:0 2025-12-04T13:48:23.2034533Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::FloatFunctional:0, line 23 <- wrt source file 2025-12-04T13:48:23.2037094Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::FloatFunctional:0 2025-12-04T13:48:23.2037614Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::QFunctional:0, line 176 <- wrt source file 2025-12-04T13:48:23.2039415Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/functional_modules.py::QFunctional:0 2025-12-04T13:48:23.2039900Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/linear.py::Linear:0, line 135 <- wrt source file 2025-12-04T13:48:23.2040578Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/linear.py::Linear:0 
2025-12-04T13:48:23.2042008Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/rnn.py::LSTM:0, line 24 <- wrt source file 2025-12-04T13:48:23.2042609Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/nn/quantized/modules/rnn.py::LSTM:0 2025-12-04T13:48:23.2043531Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py::BaseDataScheduler.get_schedule_param:0, line 98 <- wrt source file 2025-12-04T13:48:23.2060387Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_scheduler/base_data_scheduler.py::BaseDataScheduler.get_schedule_param:0 2025-12-04T13:48:23.2061104Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py::BaseDataSparsifier:0, line 55 <- wrt source file 2025-12-04T13:48:23.2061944Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/data_sparsifier/base_data_sparsifier.py::BaseDataSparsifier:0 2025-12-04T13:48:23.2062483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/scheduler/lambda_scheduler.py::LambdaSL:0, line 24 <- wrt source file 2025-12-04T13:48:23.2064409Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/scheduler/lambda_scheduler.py::LambdaSL:0 2025-12-04T13:48:23.2065490Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier:0, line 47 <- wrt source file 2025-12-04T13:48:23.2066020Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier:0 2025-12-04T13:48:23.2066553Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier.squash_mask:0, line 251 
<- wrt source file 2025-12-04T13:48:23.2068336Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/sparsifier/base_sparsifier.py::BaseSparsifier.squash_mask:0 2025-12-04T13:48:23.2068837Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuse_modules.py::fuse_modules:0, line 175 <- wrt source file 2025-12-04T13:48:23.2069314Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuse_modules.py::fuse_modules:0 2025-12-04T13:48:23.2069792Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn:0, line 32 <- wrt source file 2025-12-04T13:48:23.2077436Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn:0 2025-12-04T13:48:23.2077933Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn_relu:0, line 83 <- wrt source file 2025-12-04T13:48:23.2081516Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_conv_bn_relu:0 2025-12-04T13:48:23.2082669Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_linear_bn:0, line 143 <- wrt source file 2025-12-04T13:48:23.2085991Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_linear_bn:0 2025-12-04T13:48:23.2086769Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_convtranspose_bn:0, line 182 <- wrt source file 2025-12-04T13:48:23.2091147Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fuser_method_mappings.py::fuse_convtranspose_bn:0 2025-12-04T13:48:23.2091629Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_args:0, line 110 <- wrt source file 2025-12-04T13:48:23.2092169Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_args:0 2025-12-04T13:48:23.2092662Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_callable_args:0, line 132 <- wrt source file 2025-12-04T13:48:23.2093131Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/observer.py::_with_callable_args:0 2025-12-04T13:48:23.2093619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::fuse_fx:0, line 218 <- wrt source file 2025-12-04T13:48:23.2094148Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::fuse_fx:0 2025-12-04T13:48:23.2095732Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_fx:0, line 288 <- wrt source file 2025-12-04T13:48:23.2096198Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_fx:0 2025-12-04T13:48:23.2097261Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_qat_fx:0, line 427 <- wrt source file 2025-12-04T13:48:23.2098376Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::prepare_qat_fx:0 2025-12-04T13:48:23.2098911Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_fx:0, line 608 <- wrt source file 2025-12-04T13:48:23.2099376Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_fx:0 2025-12-04T13:48:23.2099888Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_to_reference_fx:0, line 668 <- wrt source file 2025-12-04T13:48:23.2100421Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::convert_to_reference_fx:0 2025-12-04T13:48:23.2100929Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::_convert_to_reference_decomposed_fx:0, line 720 <- wrt source file 2025-12-04T13:48:23.2101466Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_fx.py::_convert_to_reference_decomposed_fx:0 2025-12-04T13:48:23.2102037Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_pt2e:0, line 51 <- wrt source file 2025-12-04T13:48:23.2102506Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_pt2e:0 2025-12-04T13:48:23.2102967Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_qat_pt2e:0, line 130 <- wrt source file 2025-12-04T13:48:23.2103522Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::prepare_qat_pt2e:0 2025-12-04T13:48:23.2103987Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::convert_pt2e:0, line 228 <- wrt source file 2025-12-04T13:48:23.2104647Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/quantize_pt2e.py::convert_pt2e:0 2025-12-04T13:48:23.2105101Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::get_combined_dict:0, line 171 <- wrt source file 2025-12-04T13:48:23.2105853Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::get_combined_dict:0 
2025-12-04T13:48:23.2106458Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_path_of_module:0, line 553 <- wrt source file 2025-12-04T13:48:23.2106941Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_path_of_module:0 2025-12-04T13:48:23.2107422Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_signature_locals:0, line 575 <- wrt source file 2025-12-04T13:48:23.2107941Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_signature_locals:0 2025-12-04T13:48:23.2108397Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_default_kwargs:0, line 589 <- wrt source file 2025-12-04T13:48:23.2123326Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_default_kwargs:0 2025-12-04T13:48:23.2123810Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_normalize_kwargs:0, line 611 <- wrt source file 2025-12-04T13:48:23.2124261Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_normalize_kwargs:0 2025-12-04T13:48:23.2124706Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_num_pos_args:0, line 738 <- wrt source file 2025-12-04T13:48:23.2125154Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/utils.py::_get_num_pos_args:0 2025-12-04T13:48:23.2125639Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/backend_config.py::DTypeConfig:0, line 216 <- wrt source file 2025-12-04T13:48:23.2126159Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/backend_config.py::DTypeConfig:0 
2025-12-04T13:48:23.2126702Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/onednn.py::_fuse_linear_bn_leaky_relu:0, line 85 <- wrt source file 2025-12-04T13:48:23.2134157Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/backend_config/onednn.py::_fuse_linear_bn_leaky_relu:0 2025-12-04T13:48:23.2134696Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report.py::ModelReport:0, line 85 <- wrt source file 2025-12-04T13:48:23.2135224Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report.py::ModelReport:0 2025-12-04T13:48:23.2135821Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_filtered_tables:0, line 341 <- wrt source file 2025-12-04T13:48:23.2136537Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_filtered_tables:0 2025-12-04T13:48:23.2137205Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_table_visualization:0, line 429 <- wrt source file 2025-12-04T13:48:23.2138027Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_table_visualization:0 2025-12-04T13:48:23.2138702Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_plot_visualization:0, line 591 <- wrt source file 2025-12-04T13:48:23.2139436Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_plot_visualization:0 2025-12-04T13:48:23.2140149Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_histogram_visualization:0, line 664 <- wrt source file 2025-12-04T13:48:23.2140850Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/fx/_model_report/model_report_visualizer.py::ModelReportVisualizer.generate_histogram_visualization:0 2025-12-04T13:48:23.2141451Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_get_reduction_params:0, line 104 <- wrt source file 2025-12-04T13:48:23.2142057Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_get_reduction_params:0 2025-12-04T13:48:23.2142586Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_register_custom_op:0, line 155 <- wrt source file 2025-12-04T13:48:23.2143123Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/_affine_quantization.py::_register_custom_op:0 2025-12-04T13:48:23.2143647Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/prepare.py::_get_edge_or_node_to_group_id:0, line 189 <- wrt source file 2025-12-04T13:48:23.2144174Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/prepare.py::_get_edge_or_node_to_group_id:0 2025-12-04T13:48:23.2144709Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/utils.py::_replace_literals_with_new_placeholders:0, line 442 <- wrt source file 2025-12-04T13:48:23.2145271Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/quantization/pt2e/utils.py::_replace_literals_with_new_placeholders:0 2025-12-04T13:48:23.2145759Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/anomaly_mode.py::detect_anomaly:0, line 28 <- wrt source file 2025-12-04T13:48:23.2146214Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/anomaly_mode.py::detect_anomaly:0 2025-12-04T13:48:23.2146639Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::make_dual:0, line 82 <- wrt source file 2025-12-04T13:48:23.2147066Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::make_dual:0 2025-12-04T13:48:23.2147494Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::unpack_dual:0, line 151 <- wrt source file 2025-12-04T13:48:23.2147927Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::unpack_dual:0 2025-12-04T13:48:23.2148346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::dual_level:0, line 187 <- wrt source file 2025-12-04T13:48:23.2148795Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/forward_ad.py::dual_level:0 2025-12-04T13:48:23.2149244Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_backward:0, line 72 <- wrt source file 2025-12-04T13:48:23.2149734Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_backward:0 2025-12-04T13:48:23.2150248Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_forward:0, line 116 <- wrt source file 2025-12-04T13:48:23.2150732Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.save_for_forward:0 2025-12-04T13:48:23.2151217Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_dirty:0, line 169 <- wrt source file 2025-12-04T13:48:23.2151684Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_dirty:0 2025-12-04T13:48:23.2152211Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_non_differentiable:0, line 216 <- wrt source file 2025-12-04T13:48:23.2152721Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.mark_non_differentiable:0 2025-12-04T13:48:23.2153219Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.set_materialize_grads:0, line 245 <- wrt source file 2025-12-04T13:48:23.2153722Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::FunctionCtx.set_materialize_grads:0 2025-12-04T13:48:23.2154174Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::Function:0, line 487 <- wrt source file 2025-12-04T13:48:23.2154594Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/function.py::Function:0 2025-12-04T13:48:23.2154998Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vjp:0, line 300 <- wrt source file 2025-12-04T13:48:23.2155443Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vjp:0 2025-12-04T13:48:23.2155846Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jvp:0, line 402 <- wrt source file 2025-12-04T13:48:23.2156259Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jvp:0 
2025-12-04T13:48:23.2156675Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jacobian:0, line 642 <- wrt source file 2025-12-04T13:48:23.2157104Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::jacobian:0 2025-12-04T13:48:23.2157521Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hessian:0, line 907 <- wrt source file 2025-12-04T13:48:23.2157943Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hessian:0 2025-12-04T13:48:23.2158353Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vhp:0, line 1026 <- wrt source file 2025-12-04T13:48:23.2158766Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::vhp:0 2025-12-04T13:48:23.2159168Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hvp:0, line 1125 <- wrt source file 2025-12-04T13:48:23.2159608Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/functional.py::hvp:0 2025-12-04T13:48:23.2160009Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::no_grad:0, line 50 <- wrt source file 2025-12-04T13:48:23.2160420Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::no_grad:0 2025-12-04T13:48:23.2160845Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::enable_grad:0, line 108 <- wrt source file 2025-12-04T13:48:23.2161297Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::enable_grad:0 2025-12-04T13:48:23.2161724Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::set_grad_enabled:0, line 166 <- wrt source file 2025-12-04T13:48:23.2162248Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::set_grad_enabled:0 2025-12-04T13:48:23.2162680Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::inference_mode:0, line 252 <- wrt source file 2025-12-04T13:48:23.2163114Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/grad_mode.py::inference_mode:0 2025-12-04T13:48:23.2163526Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.name:0, line 60 <- wrt source file 2025-12-04T13:48:23.2163935Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.name:0 2025-12-04T13:48:23.2164346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_hook:0, line 117 <- wrt source file 2025-12-04T13:48:23.2164782Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_hook:0 2025-12-04T13:48:23.2165211Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_prehook:0, line 154 <- wrt source file 2025-12-04T13:48:23.2165671Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::Node.register_prehook:0 2025-12-04T13:48:23.2166105Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::saved_tensors_hooks:0, line 292 <- wrt source file 2025-12-04T13:48:23.2166544Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::saved_tensors_hooks:0 2025-12-04T13:48:23.2166957Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::save_on_cpu:0, line 362 <- wrt source file 2025-12-04T13:48:23.2167368Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::save_on_cpu:0 2025-12-04T13:48:23.2167796Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::disable_saved_tensors_hooks:0, line 419 <- wrt source file 2025-12-04T13:48:23.2168257Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::disable_saved_tensors_hooks:0 2025-12-04T13:48:23.2168707Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::register_multi_grad_hook:0, line 503 <- wrt source file 2025-12-04T13:48:23.2170731Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::register_multi_grad_hook:0 2025-12-04T13:48:23.2171199Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::allow_mutation_on_saved_tensors:0, line 777 <- wrt source file 2025-12-04T13:48:23.2181990Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py::allow_mutation_on_saved_tensors:0 2025-12-04T13:48:23.2182428Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::profile:0, line 182 <- wrt source file 2025-12-04T13:48:23.2182871Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::profile:0 2025-12-04T13:48:23.2183338Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::record_function:0, line 760 <- wrt source file 2025-12-04T13:48:23.2183801Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::record_function:0 2025-12-04T13:48:23.2184222Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_itt:0, line 899 <- wrt source file 2025-12-04T13:48:23.2184672Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_itt:0 2025-12-04T13:48:23.2185088Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_nvtx:0, line 972 <- wrt source file 
2025-12-04T13:48:23.2185499Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler.py::emit_nvtx:0 2025-12-04T13:48:23.2185911Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler_util.py::EventList:0, line 60 <- wrt source file 2025-12-04T13:48:23.2186354Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/profiler_util.py::EventList:0 2025-12-04T13:48:23.2186759Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_register_buffer:0, line 43 <- wrt source file 2025-12-04T13:48:23.2187174Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_register_buffer:0 2025-12-04T13:48:23.2187577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_deregister_buffer:0, line 59 <- wrt source file 2025-12-04T13:48:23.2187994Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::gds_deregister_buffer:0 2025-12-04T13:48:23.2188381Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::GdsFile:0, line 86 <- wrt source file 2025-12-04T13:48:23.2188762Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/gds.py::GdsFile:0 2025-12-04T13:48:23.2189165Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:0, line 114 <- wrt source file 2025-12-04T13:48:23.2189581Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:0 2025-12-04T13:48:23.2189984Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:1, line 125 <- wrt source file 2025-12-04T13:48:23.2190395Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:1 2025-12-04T13:48:23.2190797Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:2, line 140 <- wrt source file 2025-12-04T13:48:23.2191213Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_jit_fn:2 2025-12-04T13:48:23.2191637Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_multi_output_jit_fn:0, line 173 <- wrt source file 2025-12-04T13:48:23.2192138Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/jiterator.py::_create_multi_output_jit_fn:0 2025-12-04T13:48:23.2192569Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/profiler.py::profile:0, line 75 <- wrt source file 2025-12-04T13:48:23.2192960Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/profiler.py::profile:0 2025-12-04T13:48:23.2193400Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.composition:0, line 125 <- wrt source file 2025-12-04T13:48:23.2193928Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.composition:0 2025-12-04T13:48:23.2194407Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.complement:0, line 142 <- wrt source file 2025-12-04T13:48:23.2194904Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.complement:0 2025-12-04T13:48:23.2195380Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.remap_to_tensor:0, line 281 <- wrt source file 2025-12-04T13:48:23.2195869Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_mesh_layout.py::_MeshLayout.remap_to_tensor:0 2025-12-04T13:48:23.2196337Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh:0, line 167 <- wrt source file 2025-12-04T13:48:23.2196777Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh:0 2025-12-04T13:48:23.2197232Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh.get_local_rank:0, line 1027 <- wrt source file 2025-12-04T13:48:23.2197719Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::DeviceMesh.get_local_rank:0 2025-12-04T13:48:23.2198179Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::init_device_mesh:0, line 1317 <- wrt source file 2025-12-04T13:48:23.2198660Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py::init_device_mesh:0 2025-12-04T13:48:23.2199126Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_coalescing_manager:0, line 2652 <- wrt source file 2025-12-04T13:48:23.2199613Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_coalescing_manager:0 2025-12-04T13:48:23.2200081Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_time_estimator:0, line 2754 <- wrt source file 2025-12-04T13:48:23.2200555Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::_time_estimator:0 2025-12-04T13:48:23.2201019Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::batch_isend_irecv:0, line 2801 <- wrt source file 2025-12-04T13:48:23.2201496Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::batch_isend_irecv:0 2025-12-04T13:48:23.2202002Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_reduce:0, line 2938 <- wrt source file 2025-12-04T13:48:23.2202458Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_reduce:0 2025-12-04T13:48:23.2202924Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_object:0, line 3221 <- wrt source file 2025-12-04T13:48:23.2203419Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_object:0 2025-12-04T13:48:23.2203878Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather_object:0, line 3325 <- wrt source file 2025-12-04T13:48:23.2204364Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather_object:0 2025-12-04T13:48:23.2204840Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::send_object_list:0, line 3457 <- wrt source file 2025-12-04T13:48:23.2205313Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::send_object_list:0 2025-12-04T13:48:23.2205792Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::recv_object_list:0, line 3574 <- wrt source file 2025-12-04T13:48:23.2206265Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::recv_object_list:0 2025-12-04T13:48:23.2206741Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::broadcast_object_list:0, line 3719 <- wrt source file 2025-12-04T13:48:23.2207233Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::broadcast_object_list:0 2025-12-04T13:48:23.2207714Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter_object_list:0, line 3844 <- wrt source file 2025-12-04T13:48:23.2208199Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter_object_list:0 2025-12-04T13:48:23.2208658Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather:0, line 3947 <- wrt source file 2025-12-04T13:48:23.2209114Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather:0 2025-12-04T13:48:23.2209577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_into_tensor:0, line 4054 <- wrt source file 2025-12-04T13:48:23.2210073Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_into_tensor:0 2025-12-04T13:48:23.2210575Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_coalesced:0, line 4192 <- wrt source file 2025-12-04T13:48:23.2211062Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_gather_coalesced:0 2025-12-04T13:48:23.2211517Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather:0, line 4298 <- wrt source file 2025-12-04T13:48:23.2212038Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::gather:0 2025-12-04T13:48:23.2212476Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter:0, line 4383 <- wrt source file 2025-12-04T13:48:23.2212930Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::scatter:0 2025-12-04T13:48:23.2213388Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::reduce_scatter_tensor:0, line 4521 <- wrt source file 2025-12-04T13:48:23.2213898Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::reduce_scatter_tensor:0 2025-12-04T13:48:23.2214371Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all_single:0, line 4663 <- wrt source file 2025-12-04T13:48:23.2214844Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all_single:0 2025-12-04T13:48:23.2215338Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all:0, line 4797 <- wrt source file 2025-12-04T13:48:23.2215801Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::all_to_all:0 2025-12-04T13:48:23.2216274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::monitored_barrier:0, line 5009 <- wrt source file 2025-12-04T13:48:23.2216757Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::monitored_barrier:0 2025-12-04T13:48:23.2217218Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups:0, line 5562 <- wrt source file 2025-12-04T13:48:23.2217685Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups:0 2025-12-04T13:48:23.2218168Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups_by_enumeration:0, line 5656 <- wrt source file 2025-12-04T13:48:23.2218684Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py::new_subgroups_by_enumeration:0 
2025-12-04T13:48:23.2219134Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/launch.py::__doc__:0, line 84 <- wrt source file 2025-12-04T13:48:23.2219546Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/launch.py::__doc__:0 2025-12-04T13:48:23.2219936Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/run.py::__doc__:0, line 57 <- wrt source file 2025-12-04T13:48:23.2220339Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/run.py::__doc__:0 2025-12-04T13:48:23.2220757Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/autograd/__init__.py::context:0, line 47 <- wrt source file 2025-12-04T13:48:23.2221204Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/autograd/__init__.py::context:0 2025-12-04T13:48:23.2221679Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/checkpoint_activation.py::checkpoint:0, line 53 <- wrt source file 2025-12-04T13:48:23.2222231Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/checkpoint_activation.py::checkpoint:0 2025-12-04T13:48:23.2222703Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/contract.py::contract:0, line 67 <- wrt source file 2025-12-04T13:48:23.2223176Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/contract.py::contract:0 2025-12-04T13:48:23.2223637Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate.py::replicate:0, line 190 <- wrt source file 2025-12-04T13:48:23.2228430Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate.py::replicate:0 2025-12-04T13:48:23.2228951Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate_with_fsdp.py::replicate:0, line 265 <- wrt source file 2025-12-04T13:48:23.2229453Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_composable/replicate_with_fsdp.py::replicate:0 2025-12-04T13:48:23.2229977Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_optim/__init__.py::named_params_with_sharded_tensor:0, line 31 <- wrt source file 2025-12-04T13:48:23.2230576Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_optim/__init__.py::named_params_with_sharded_tensor:0 2025-12-04T13:48:23.2231112Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::init_from_local_shards:0, line 384 <- wrt source file 2025-12-04T13:48:23.2231673Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::init_from_local_shards:0 2025-12-04T13:48:23.2232242Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::custom_sharded_op_impl:0, line 457 <- wrt source file 2025-12-04T13:48:23.2232776Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py::custom_sharded_op_impl:0 2025-12-04T13:48:23.2233321Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor._init_from_local_tensor:0, line 860 <- wrt source file 2025-12-04T13:48:23.2233898Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor._init_from_local_tensor:0 2025-12-04T13:48:23.2234443Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor.reshard:0, 
line 1098 <- wrt source file 2025-12-04T13:48:23.2234986Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/api.py::ShardedTensor.reshard:0 2025-12-04T13:48:23.2235502Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/_ops/_common.py::_sharded_op_common:0, line 18 <- wrt source file 2025-12-04T13:48:23.2236036Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharded_tensor/_ops/_common.py::_sharded_op_common:0 2025-12-04T13:48:23.2236535Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharding_plan/api.py::ShardingPlan:0, line 36 <- wrt source file 2025-12-04T13:48:23.2237032Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_shard/sharding_plan/api.py::ShardingPlan:0 2025-12-04T13:48:23.2237508Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::put:0, line 275 <- wrt source file 2025-12-04T13:48:23.2237996Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::put:0 2025-12-04T13:48:23.2238473Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get:0, line 328 <- wrt source file 2025-12-04T13:48:23.2238964Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get:0 2025-12-04T13:48:23.2239446Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get_nbi:0, line 378 <- wrt source file 2025-12-04T13:48:23.2239970Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::get_nbi:0 2025-12-04T13:48:23.2240476Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::putmem_signal_block:0, line 453 <- wrt source file 2025-12-04T13:48:23.2241014Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::putmem_signal_block:0 2025-12-04T13:48:23.2241566Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::wait_until:0, line 531 <- wrt source file 2025-12-04T13:48:23.2242154Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::wait_until:0 2025-12-04T13:48:23.2242684Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_wait_until:0, line 593 <- wrt source file 2025-12-04T13:48:23.2243222Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_wait_until:0 2025-12-04T13:48:23.2243728Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_op:0, line 651 <- wrt source file 2025-12-04T13:48:23.2244259Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::signal_op:0 2025-12-04T13:48:23.2244751Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::fence:0, line 704 <- wrt source file 2025-12-04T13:48:23.2245245Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::fence:0 2025-12-04T13:48:23.2245729Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::quiet:0, line 750 <- wrt source file 2025-12-04T13:48:23.2246221Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::quiet:0 2025-12-04T13:48:23.2246700Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::my_pe:0, line 794 <- wrt source file 2025-12-04T13:48:23.2247197Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::my_pe:0 2025-12-04T13:48:23.2247675Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::n_pes:0, line 837 <- wrt source file 2025-12-04T13:48:23.2248171Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::n_pes:0 2025-12-04T13:48:23.2248665Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::barrier_all:0, line 888 <- wrt source file 2025-12-04T13:48:23.2249178Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::barrier_all:0 2025-12-04T13:48:23.2249671Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::sync_all:0, line 934 <- wrt source file 2025-12-04T13:48:23.2250177Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::sync_all:0 2025-12-04T13:48:23.2250689Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::alltoall:0, line 973 <- wrt source file 2025-12-04T13:48:23.2251201Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::alltoall:0 2025-12-04T13:48:23.2251699Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::broadcast:0, line 1028 <- wrt source file 2025-12-04T13:48:23.2252279Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::broadcast:0 2025-12-04T13:48:23.2252776Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce:0, line 1089 <- wrt source file 2025-12-04T13:48:23.2253290Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce:0 2025-12-04T13:48:23.2253806Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce_extern_wrapper:0, line 1135 <- wrt source file 2025-12-04T13:48:23.2254356Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_symmetric_memory/_nvshmem_triton.py::reduce_extern_wrapper:0 2025-12-04T13:48:23.2254858Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_tools/memory_tracker.py::MemoryTracker:0, line 55 <- wrt source file 2025-12-04T13:48:23.2255341Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/_tools/memory_tracker.py::MemoryTracker:0 2025-12-04T13:48:23.2255796Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/join.py::Join:0, line 141 <- wrt source file 2025-12-04T13:48:23.2256239Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/join.py::Join:0 2025-12-04T13:48:23.2256728Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/__init__.py::register_ddp_comm_hook:0, line 137 <- wrt source file 2025-12-04T13:48:23.2257284Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/__init__.py::register_ddp_comm_hook:0 2025-12-04T13:48:23.2257823Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py::noop_hook:0, line 23 <- wrt source file 2025-12-04T13:48:23.2258366Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/debugging_hooks.py::noop_hook:0 2025-12-04T13:48:23.2258907Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::allreduce_hook:0, line 51 <- wrt source file 2025-12-04T13:48:23.2259471Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::allreduce_hook:0 2025-12-04T13:48:23.2260018Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_hook:0, line 110 <- wrt source file 2025-12-04T13:48:23.2260584Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_hook:0 2025-12-04T13:48:23.2261136Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_hook:0, line 131 <- wrt source file 2025-12-04T13:48:23.2261753Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_hook:0 2025-12-04T13:48:23.2262356Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_wrapper:0, line 149 <- wrt source file 2025-12-04T13:48:23.2262930Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::fp16_compress_wrapper:0 2025-12-04T13:48:23.2263523Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_wrapper:0, line 188 <- wrt source file 2025-12-04T13:48:23.2264095Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/default_hooks.py::bf16_compress_wrapper:0 2025-12-04T13:48:23.2264704Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py::post_localSGD_hook:0, line 91 <- wrt source file 2025-12-04T13:48:23.2265281Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/post_localSGD_hook.py::post_localSGD_hook:0 2025-12-04T13:48:23.2265831Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::powerSGD_hook:0, line 395 <- wrt source file 2025-12-04T13:48:23.2266385Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::powerSGD_hook:0 2025-12-04T13:48:23.2266934Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::batched_powerSGD_hook:0, line 708 <- wrt source file 2025-12-04T13:48:23.2267510Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py::batched_powerSGD_hook:0 2025-12-04T13:48:23.2268095Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_pertensor_hook:0, line 64 <- wrt source file 2025-12-04T13:48:23.2268708Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_pertensor_hook:0 2025-12-04T13:48:23.2269316Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_perchannel_hook:0, line 146 <- wrt source file 2025-12-04T13:48:23.2269938Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/ddp_comm_hooks/quantization_hooks.py::quantization_perchannel_hook:0 2025-12-04T13:48:23.2270523Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/averagers.py::PeriodicModelAverager:0, line 56 <- wrt source file 2025-12-04T13:48:23.2271104Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/averagers.py::PeriodicModelAverager:0 2025-12-04T13:48:23.2271706Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py::HierarchicalModelAverager:0, line 53 <- wrt source file 2025-12-04T13:48:23.2272406Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/algorithms/model_averaging/hierarchical_model_averager.py::HierarchicalModelAverager:0 2025-12-04T13:48:23.2273021Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::BroadcastingTorchSaveReader:0, line 49 <- wrt source file 2025-12-04T13:48:23.2273571Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::BroadcastingTorchSaveReader:0 2025-12-04T13:48:23.2274106Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::DynamicMetaLoadPlanner:0, line 173 <- wrt source file 2025-12-04T13:48:23.2274721Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/format_utils.py::DynamicMetaLoadPlanner:0 2025-12-04T13:48:23.2275258Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/optimizer.py::load_sharded_optimizer_state_dict:0, line 228 <- wrt source file 2025-12-04T13:48:23.2275829Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/optimizer.py::load_sharded_optimizer_state_dict:0 2025-12-04T13:48:23.2276345Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::get_state_dict:0, line 1276 <- wrt source file 2025-12-04T13:48:23.2276835Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::get_state_dict:0 2025-12-04T13:48:23.2277336Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_model_state_dict:0, line 1531 <- wrt source file 2025-12-04T13:48:23.2277858Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_model_state_dict:0 2025-12-04T13:48:23.2278375Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_optimizer_state_dict:0, line 1590 <- wrt source file 2025-12-04T13:48:23.2278914Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict.py::_patch_optimizer_state_dict:0 2025-12-04T13:48:23.2279409Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_loader.py::load:0, line 131 <- wrt source file 2025-12-04T13:48:23.2279896Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_loader.py::load:0 2025-12-04T13:48:23.2280369Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::save:0, line 160 <- wrt source file 2025-12-04T13:48:23.2280850Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::save:0 2025-12-04T13:48:23.2281329Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::async_save:0, line 275 <- wrt source file 2025-12-04T13:48:23.2281828Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/state_dict_saver.py::async_save:0 2025-12-04T13:48:23.2282380Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/barriers.py::BarrierConfig:0, line 50 <- wrt source file 2025-12-04T13:48:23.2282923Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/barriers.py::BarrierConfig:0 2025-12-04T13:48:23.2283456Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_sync_checkpointer:0, line 78 <- wrt source file 2025-12-04T13:48:23.2284013Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_sync_checkpointer:0 2025-12-04T13:48:23.2284582Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_async_checkpointer:0, line 139 <- wrt source file 2025-12-04T13:48:23.2285141Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/builder.py::make_async_checkpointer:0 2025-12-04T13:48:23.2285720Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer:0, line 104 <- wrt source file 2025-12-04T13:48:23.2286277Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer:0 2025-12-04T13:48:23.2286847Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer.save:0, line 142 <- wrt source file 2025-12-04T13:48:23.2287423Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::SyncCheckpointer.save:0 2025-12-04T13:48:23.2287976Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer:0, line 213 <- wrt source file 2025-12-04T13:48:23.2288538Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer:0 2025-12-04T13:48:23.2289095Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer.save:0, line 260 <- wrt source file 2025-12-04T13:48:23.2289678Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/checkpointer.py::AsyncCheckpointer.save:0 2025-12-04T13:48:23.2290230Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/staging.py::DefaultStager.close:0, line 211 <- wrt source file 2025-12-04T13:48:23.2290784Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/_experimental/staging.py::DefaultStager.close:0 2025-12-04T13:48:23.2291320Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/events/__init__.py::construct_and_record_rdzv_event:0, line 110 <- wrt source file 2025-12-04T13:48:23.2291913Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/events/__init__.py::construct_and_record_rdzv_event:0 2025-12-04T13:48:23.2292462Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/rendezvous/api.py::RendezvousHandler.shutdown:0, line 232 <- wrt source file 2025-12-04T13:48:23.2293010Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/rendezvous/api.py::RendezvousHandler.shutdown:0 2025-12-04T13:48:23.2293527Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/utils/distributed.py::get_free_port:0, line 140 <- wrt source file 2025-12-04T13:48:23.2294037Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/elastic/utils/distributed.py::get_free_port:0 2025-12-04T13:48:23.2294507Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::MixedPrecision:0, line 202 <- wrt source file 2025-12-04T13:48:23.2294957Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::MixedPrecision:0 2025-12-04T13:48:23.2295427Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::StateDictType:0, line 262 <- wrt source file 2025-12-04T13:48:23.2295870Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py::StateDictType:0 2025-12-04T13:48:23.2296370Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel:0, line 125 <- wrt source file 2025-12-04T13:48:23.2296968Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel:0 2025-12-04T13:48:23.2297553Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.set_state_dict_type:0, line 651 <- wrt source file 2025-12-04T13:48:23.2298196Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.set_state_dict_type:0 2025-12-04T13:48:23.2298806Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.state_dict_type:0, line 805 <- wrt source file 2025-12-04T13:48:23.2299418Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.state_dict_type:0 2025-12-04T13:48:23.2300047Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.shard_full_optim_state_dict:0, line 1513 <- wrt source file 2025-12-04T13:48:23.2300696Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.shard_full_optim_state_dict:0 2025-12-04T13:48:23.2301339Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.scatter_full_optim_state_dict:0, line 1633 <- wrt source file 2025-12-04T13:48:23.2302066Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.scatter_full_optim_state_dict:0 2025-12-04T13:48:23.2302696Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.rekey_optim_state_dict:0, line 1718 <- wrt source file 2025-12-04T13:48:23.2303322Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.rekey_optim_state_dict:0 2025-12-04T13:48:23.2303948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict:0, line 1850 <- wrt source file 2025-12-04T13:48:23.2304559Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict:0 2025-12-04T13:48:23.2305176Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict_to_load:0, line 1937 <- wrt source file 2025-12-04T13:48:23.2305814Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py::FullyShardedDataParallel.optim_state_dict_to_load:0 2025-12-04T13:48:23.2306370Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/sharded_grad_scaler.py::ShardedGradScaler:0, line 57 <- wrt source file 2025-12-04T13:48:23.2306898Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/sharded_grad_scaler.py::ShardedGradScaler:0 2025-12-04T13:48:23.2307361Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py::CustomPolicy:0, line 227 <- wrt source file 2025-12-04T13:48:23.2307802Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py::CustomPolicy:0 2025-12-04T13:48:23.2308274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/functional.py::_all_gather_base:0, line 134 <- wrt source file 2025-12-04T13:48:23.2308738Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/functional.py::_all_gather_base:0 
2025-12-04T13:48:23.2309236Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.__init__:0, line 196 <- wrt source file 2025-12-04T13:48:23.2309749Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.__init__:0 2025-12-04T13:48:23.2310273Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.init_from_module_rref:0, line 527 <- wrt source file 2025-12-04T13:48:23.2310833Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::_RemoteModule.init_from_module_rref:0 2025-12-04T13:48:23.2311333Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::RemoteModule:0, line 658 <- wrt source file 2025-12-04T13:48:23.2311809Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/nn/api/remote_module.py::RemoteModule:0 2025-12-04T13:48:23.2312370Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_apply_optimizer_in_backward:0, line 43 <- wrt source file 2025-12-04T13:48:23.2312932Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_apply_optimizer_in_backward:0 2025-12-04T13:48:23.2313483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_get_in_backward_optimizers:0, line 114 <- wrt source file 2025-12-04T13:48:23.2314046Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/apply_optimizer_in_backward.py::_get_in_backward_optimizers:0 2025-12-04T13:48:23.2314560Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/named_optimizer.py::_NamedOptimizer:0, 
line 43 <- wrt source file 2025-12-04T13:48:23.2315062Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/named_optimizer.py::_NamedOptimizer:0 2025-12-04T13:48:23.2315553Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/optimizer.py::DistributedOptimizer:0, line 161 <- wrt source file 2025-12-04T13:48:23.2316050Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/optimizer.py::DistributedOptimizer:0 2025-12-04T13:48:23.2316565Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/post_localSGD_optimizer.py::PostLocalSGDOptimizer:0, line 19 <- wrt source file 2025-12-04T13:48:23.2317114Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/post_localSGD_optimizer.py::PostLocalSGDOptimizer:0 2025-12-04T13:48:23.2317650Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/utils.py::register_functional_optim:0, line 37 <- wrt source file 2025-12-04T13:48:23.2318146Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/utils.py::register_functional_optim:0 2025-12-04T13:48:23.2318664Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/zero_redundancy_optimizer.py::ZeroRedundancyOptimizer:0, line 341 <- wrt source file 2025-12-04T13:48:23.2319254Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/optim/zero_redundancy_optimizer.py::ZeroRedundancyOptimizer:0 2025-12-04T13:48:23.2319747Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/_IR.py::pipe_split:0, line 345 <- wrt source file 2025-12-04T13:48:23.2320228Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/_IR.py::pipe_split:0 2025-12-04T13:48:23.2320705Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::_CustomReducer:0, line 36 <- wrt source file 2025-12-04T13:48:23.2321230Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::_CustomReducer:0 2025-12-04T13:48:23.2321947Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_tuple:0, line 85 <- wrt source file 2025-12-04T13:48:23.2322498Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_tuple:0 2025-12-04T13:48:23.2323032Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_dict:0, line 104 <- wrt source file 2025-12-04T13:48:23.2323579Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/pipelining/microbatch.py::TensorChunkSpec.from_dict:0 2025-12-04T13:48:23.2324046Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::_wait_all:0, line 174 <- wrt source file 2025-12-04T13:48:23.2324475Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::_wait_all:0 2025-12-04T13:48:23.2324897Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::shutdown:0, line 343 <- wrt source file 2025-12-04T13:48:23.2325326Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::shutdown:0 2025-12-04T13:48:23.2325736Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::remote:0, line 605 <- wrt source file 2025-12-04T13:48:23.2326160Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::remote:0 2025-12-04T13:48:23.2326568Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_sync:0, line 786 <- wrt source file 2025-12-04T13:48:23.2326988Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_sync:0 2025-12-04T13:48:23.2327402Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_async:0, line 878 <- wrt source file 2025-12-04T13:48:23.2327828Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/api.py::rpc_async:0 2025-12-04T13:48:23.2328267Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/functions.py::async_execution:0, line 34 <- wrt source file 2025-12-04T13:48:23.2328787Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/functions.py::async_execution:0 2025-12-04T13:48:23.2329306Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/options.py::TensorPipeRpcBackendOptions.set_device_map:0, line 126 <- wrt source file 2025-12-04T13:48:23.2329868Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/options.py::TensorPipeRpcBackendOptions.set_device_map:0 2025-12-04T13:48:23.2330455Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/server_process_global_profiler.py::_server_process_global_profile:0, line 62 <- wrt source file 2025-12-04T13:48:23.2331037Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/rpc/server_process_global_profiler.py::_server_process_global_profile:0 2025-12-04T13:48:23.2331572Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_api.py::_shard_tensor:0, line 887 <- wrt source file 2025-12-04T13:48:23.2332066Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_api.py::_shard_tensor:0 2025-12-04T13:48:23.2332540Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::ShardOrderEntry:0, line 32 <- wrt source file 2025-12-04T13:48:23.2333033Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::ShardOrderEntry:0 2025-12-04T13:48:23.2333570Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec._convert_shard_order_to_StridedShard:0, line 165 <- wrt source file 2025-12-04T13:48:23.2334166Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec._convert_shard_order_to_StridedShard:0 2025-12-04T13:48:23.2334755Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec._maybe_convert_StridedShard_to_shard_order:0, line 241 <- wrt source file 2025-12-04T13:48:23.2335365Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec._maybe_convert_StridedShard_to_shard_order:0 2025-12-04T13:48:23.2335941Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec.format_shard_order_str:0, line 461 <- wrt source file 2025-12-04T13:48:23.2336496Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_dtensor_spec.py::DTensorSpec.format_shard_order_str:0 2025-12-04T13:48:23.2337050Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_random.py::OffsetBasedRNGTracker._set_pre_op_offset:0, line 310 <- wrt source file 2025-12-04T13:48:23.2337602Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_random.py::OffsetBasedRNGTracker._set_pre_op_offset:0 2025-12-04T13:48:23.2338116Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_ops/_common_rules.py::pointwise_rule:0, line 234 <- wrt source file 2025-12-04T13:48:23.2338620Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/_ops/_common_rules.py::pointwise_rule:0 2025-12-04T13:48:23.2339115Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_func_map.py::local_map:0, line 103 <- wrt source file 2025-12-04T13:48:23.2339620Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_func_map.py::local_map:0 2025-12-04T13:48:23.2340174Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_register_sharding.py::register_sharding:0, line 46 <- wrt source file 2025-12-04T13:48:23.2340736Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_register_sharding.py::register_sharding:0 2025-12-04T13:48:23.2341358Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_LoadBalancer._generate_indices:0, line 30 <- wrt source file 2025-12-04T13:48:23.2342055Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_LoadBalancer._generate_indices:0 2025-12-04T13:48:23.2342723Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_HeadTailLoadBalancer._generate_indices:0, line 102 <- wrt source file 2025-12-04T13:48:23.2343421Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_HeadTailLoadBalancer._generate_indices:0 2025-12-04T13:48:23.2344107Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PerDocumentHeadTailLoadBalancer._generate_indices:0, line 213 <- wrt source file 2025-12-04T13:48:23.2344837Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PerDocumentHeadTailLoadBalancer._generate_indices:0 2025-12-04T13:48:23.2345508Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PTRRLoadBalancer.ptrr_scheduling:0, line 339 <- wrt source file 2025-12-04T13:48:23.2346170Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PTRRLoadBalancer.ptrr_scheduling:0 2025-12-04T13:48:23.2346816Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PTRRLoadBalancer._generate_indices:0, line 397 <- wrt source file 2025-12-04T13:48:23.2347478Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/experimental/_context_parallel/_load_balancer.py::_PTRRLoadBalancer._generate_indices:0 2025-12-04T13:48:23.2348042Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/api.py::parallelize_module:0, line 55 <- wrt source file 2025-12-04T13:48:23.2348549Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/api.py::parallelize_module:0 2025-12-04T13:48:23.2349047Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/ddp.py::_pre_dp_module_transform:0, line 88 <- wrt source file 2025-12-04T13:48:23.2349568Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/ddp.py::_pre_dp_module_transform:0 2025-12-04T13:48:23.2350072Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/loss.py::loss_parallel:0, line 56 <- wrt source file 2025-12-04T13:48:23.2350568Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/loss.py::loss_parallel:0 2025-12-04T13:48:23.2351059Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::ColwiseParallel:0, line 64 <- wrt source file 2025-12-04T13:48:23.2351584Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::ColwiseParallel:0 2025-12-04T13:48:23.2352115Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::RowwiseParallel:0, line 198 <- wrt source file 2025-12-04T13:48:23.2352638Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::RowwiseParallel:0 2025-12-04T13:48:23.2353153Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::SequenceParallel:0, line 350 <- wrt source file 2025-12-04T13:48:23.2353662Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::SequenceParallel:0 2025-12-04T13:48:23.2354188Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInput:0, line 452 <- wrt source file 2025-12-04T13:48:23.2354708Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInput:0 2025-12-04T13:48:23.2355216Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleOutput:0, line 614 <- wrt 
source file 2025-12-04T13:48:23.2355740Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleOutput:0 2025-12-04T13:48:23.2356265Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInputOutput:0, line 740 <- wrt source file 2025-12-04T13:48:23.2356816Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/tensor/parallel/style.py::PrepareModuleInputOutput:0 2025-12-04T13:48:23.2357297Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/bernoulli.py::Bernoulli:0, line 30 <- wrt source file 2025-12-04T13:48:23.2357744Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/bernoulli.py::Bernoulli:0 2025-12-04T13:48:23.2358160Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/beta.py::Beta:0, line 21 <- wrt source file 2025-12-04T13:48:23.2358572Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/beta.py::Beta:0 2025-12-04T13:48:23.2358990Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/binomial.py::Binomial:0, line 31 <- wrt source file 2025-12-04T13:48:23.2359429Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/binomial.py::Binomial:0 2025-12-04T13:48:23.2359868Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/categorical.py::Categorical:0, line 42 <- wrt source file 2025-12-04T13:48:23.2360330Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/categorical.py::Categorical:0 2025-12-04T13:48:23.2360761Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/cauchy.py::Cauchy:0, line 23 <- wrt source file 2025-12-04T13:48:23.2361185Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/cauchy.py::Cauchy:0 2025-12-04T13:48:23.2361586Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/chi2.py::Chi2:0, line 18 <- wrt source file 2025-12-04T13:48:23.2362110Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/chi2.py::Chi2:0 2025-12-04T13:48:23.2362553Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::is_dependent:0, line 167 <- wrt source file 2025-12-04T13:48:23.2363015Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::is_dependent:0 2025-12-04T13:48:23.2363476Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::_DependentProperty:0, line 188 <- wrt source file 2025-12-04T13:48:23.2363996Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/constraints.py::_DependentProperty:0 2025-12-04T13:48:23.2364492Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/continuous_bernoulli.py::ContinuousBernoulli:0, line 35 <- wrt source file 2025-12-04T13:48:23.2365043Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/continuous_bernoulli.py::ContinuousBernoulli:0 2025-12-04T13:48:23.2365509Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/dirichlet.py::Dirichlet:0, line 44 <- wrt source file 2025-12-04T13:48:23.2365949Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/dirichlet.py::Dirichlet:0 2025-12-04T13:48:23.2366395Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/exponential.py::Exponential:0, line 20 <- wrt source file 2025-12-04T13:48:23.2366854Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/exponential.py::Exponential:0 
2025-12-04T13:48:23.2367315Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/fishersnedecor.py::FisherSnedecor:0, line 21 <- wrt source file 2025-12-04T13:48:23.2367797Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/fishersnedecor.py::FisherSnedecor:0 2025-12-04T13:48:23.2368228Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gamma.py::Gamma:0, line 24 <- wrt source file 2025-12-04T13:48:23.2368641Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gamma.py::Gamma:0 2025-12-04T13:48:23.2369093Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/generalized_pareto.py::GeneralizedPareto:0, line 26 <- wrt source file 2025-12-04T13:48:23.2369597Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/generalized_pareto.py::GeneralizedPareto:0 2025-12-04T13:48:23.2370055Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/geometric.py::Geometric:0, line 36 <- wrt source file 2025-12-04T13:48:23.2370496Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/geometric.py::Geometric:0 2025-12-04T13:48:23.2370918Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gumbel.py::Gumbel:0, line 23 <- wrt source file 2025-12-04T13:48:23.2371337Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/gumbel.py::Gumbel:0 2025-12-04T13:48:23.2371762Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_cauchy.py::HalfCauchy:0, line 24 <- wrt source file 2025-12-04T13:48:23.2372238Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_cauchy.py::HalfCauchy:0 2025-12-04T13:48:23.2372672Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_normal.py::HalfNormal:0, line 24 <- wrt source file 2025-12-04T13:48:23.2373138Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/half_normal.py::HalfNormal:0 2025-12-04T13:48:23.2373583Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/independent.py::Independent:0, line 27 <- wrt source file 2025-12-04T13:48:23.2374041Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/independent.py::Independent:0 2025-12-04T13:48:23.2374525Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/inverse_gamma.py::InverseGamma:0, line 24 <- wrt source file 2025-12-04T13:48:23.2374990Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/inverse_gamma.py::InverseGamma:0 2025-12-04T13:48:23.2375440Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/kumaraswamy.py::Kumaraswamy:0, line 30 <- wrt source file 2025-12-04T13:48:23.2375914Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/kumaraswamy.py::Kumaraswamy:0 2025-12-04T13:48:23.2376347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/laplace.py::Laplace:0, line 20 <- wrt source file 2025-12-04T13:48:23.2376776Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/laplace.py::Laplace:0 2025-12-04T13:48:23.2377208Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lkj_cholesky.py::LKJCholesky:0, line 43 <- wrt source file 2025-12-04T13:48:23.2433414Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lkj_cholesky.py::LKJCholesky:0 2025-12-04T13:48:23.2433934Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/log_normal.py::LogNormal:0, line 23 <- wrt 
source file 2025-12-04T13:48:23.2434410Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/log_normal.py::LogNormal:0 2025-12-04T13:48:23.2434882Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/logistic_normal.py::LogisticNormal:0, line 28 <- wrt source file 2025-12-04T13:48:23.2435368Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/logistic_normal.py::LogisticNormal:0 2025-12-04T13:48:23.2435903Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lowrank_multivariate_normal.py::LowRankMultivariateNormal:0, line 63 <- wrt source file 2025-12-04T13:48:23.2436470Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/lowrank_multivariate_normal.py::LowRankMultivariateNormal:0 2025-12-04T13:48:23.2436998Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/mixture_same_family.py::MixtureSameFamily:0, line 24 <- wrt source file 2025-12-04T13:48:23.2437512Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/mixture_same_family.py::MixtureSameFamily:0 2025-12-04T13:48:23.2437981Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multinomial.py::Multinomial:0, line 38 <- wrt source file 2025-12-04T13:48:23.2438443Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multinomial.py::Multinomial:0 2025-12-04T13:48:23.2438929Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multivariate_normal.py::MultivariateNormal:0, line 103 <- wrt source file 2025-12-04T13:48:23.2439444Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/multivariate_normal.py::MultivariateNormal:0 2025-12-04T13:48:23.2440111Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/normal.py::Normal:0, line 
22 <- wrt source file 2025-12-04T13:48:23.2440539Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/normal.py::Normal:0 2025-12-04T13:48:23.2441005Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py::OneHotCategorical:0, line 34 <- wrt source file 2025-12-04T13:48:23.2441612Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/one_hot_categorical.py::OneHotCategorical:0 2025-12-04T13:48:23.2442149Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/pareto.py::Pareto:0, line 20 <- wrt source file 2025-12-04T13:48:23.2442570Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/pareto.py::Pareto:0 2025-12-04T13:48:23.2443057Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/poisson.py::Poisson:0, line 25 <- wrt source file 2025-12-04T13:48:23.2443486Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/poisson.py::Poisson:0 2025-12-04T13:48:23.2443940Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_bernoulli.py::RelaxedBernoulli:0, line 130 <- wrt source file 2025-12-04T13:48:23.2444437Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_bernoulli.py::RelaxedBernoulli:0 2025-12-04T13:48:23.2444948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_categorical.py::RelaxedOneHotCategorical:0, line 117 <- wrt source file 2025-12-04T13:48:23.2445480Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/relaxed_categorical.py::RelaxedOneHotCategorical:0 2025-12-04T13:48:23.2445951Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/studentT.py::StudentT:0, line 22 <- wrt source file 2025-12-04T13:48:23.2446385Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/studentT.py::StudentT:0 2025-12-04T13:48:23.2446828Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CatTransform:0, line 1076 <- wrt source file 2025-12-04T13:48:23.2447287Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CatTransform:0 2025-12-04T13:48:23.2447743Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::StackTransform:0, line 1190 <- wrt source file 2025-12-04T13:48:23.2448212Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::StackTransform:0 2025-12-04T13:48:23.2448708Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CumulativeDistributionTransform:0, line 1268 <- wrt source file 2025-12-04T13:48:23.2449246Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/transforms.py::CumulativeDistributionTransform:0 2025-12-04T13:48:23.2449706Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/uniform.py::Uniform:0, line 21 <- wrt source file 2025-12-04T13:48:23.2450130Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/uniform.py::Uniform:0 2025-12-04T13:48:23.2450551Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/utils.py::clamp_probs:0, line 114 <- wrt source file 2025-12-04T13:48:23.2451060Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/utils.py::clamp_probs:0 2025-12-04T13:48:23.2451483Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/von_mises.py::VonMises:0, line 119 <- wrt source file 2025-12-04T13:48:23.2451988Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/von_mises.py::VonMises:0 
2025-12-04T13:48:23.2452409Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/weibull.py::Weibull:0, line 22 <- wrt source file 2025-12-04T13:48:23.2452886Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/weibull.py::Weibull:0 2025-12-04T13:48:23.2453320Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/wishart.py::Wishart:0, line 39 <- wrt source file 2025-12-04T13:48:23.2453771Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributions/wishart.py::Wishart:0 2025-12-04T13:48:23.2454202Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_unlift.py::_convert_guards_code_to_fn:0, line 158 <- wrt source file 2025-12-04T13:48:23.2454658Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/_unlift.py::_convert_guards_code_to_fn:0 2025-12-04T13:48:23.2455086Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::Dim:0, line 123 <- wrt source file 2025-12-04T13:48:23.2455510Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::Dim:0 2025-12-04T13:48:23.2455940Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:0, line 737 <- wrt source file 2025-12-04T13:48:23.2456400Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:0 2025-12-04T13:48:23.2456863Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:1, line 753 <- wrt source file 2025-12-04T13:48:23.2457316Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::ShapesCollection:1 2025-12-04T13:48:23.2457767Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::AdditionalInputs:0, 
line 837 <- wrt source file 2025-12-04T13:48:23.2458223Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/export/dynamic_shapes.py::AdditionalInputs:0 2025-12-04T13:48:23.2458635Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::_snake_case:0, line 104 <- wrt source file 2025-12-04T13:48:23.2459031Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::_snake_case:0 2025-12-04T13:48:23.2459438Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.eliminate_dead_code:0, line 2043 <- wrt source file 2025-12-04T13:48:23.2459874Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.eliminate_dead_code:0 2025-12-04T13:48:23.2460299Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.on_generate_code:0, line 2137 <- wrt source file 2025-12-04T13:48:23.2460724Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/graph.py::Graph.on_generate_code:0 2025-12-04T13:48:23.2461140Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Interpreter:0, line 75 <- wrt source file 2025-12-04T13:48:23.2461559Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Interpreter:0 2025-12-04T13:48:23.2462033Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Transformer:0, line 519 <- wrt source file 2025-12-04T13:48:23.2462447Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/interpreter.py::Transformer:0 2025-12-04T13:48:23.2462870Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/subgraph_rewriter.py::replace_pattern:0, line 126 <- wrt source file 2025-12-04T13:48:23.2463354Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/subgraph_rewriter.py::replace_pattern:0 2025-12-04T13:48:23.2463769Z 
* DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::TensorType:0, line 12 <- wrt source file 2025-12-04T13:48:23.2464173Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::TensorType:0 2025-12-04T13:48:23.2464598Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_consistent:0, line 65 <- wrt source file 2025-12-04T13:48:23.2465010Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_consistent:0 2025-12-04T13:48:23.2465414Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_more_precise:0, line 93 <- wrt source file 2025-12-04T13:48:23.2465831Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/tensor_type.py::is_more_precise:0 2025-12-04T13:48:23.2466230Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py::annotate:0, line 300 <- wrt source file 2025-12-04T13:48:23.2466625Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py::annotate:0 2025-12-04T13:48:23.2467015Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py::annotate_fn:0, line 344 <- wrt source file 2025-12-04T13:48:23.2467418Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/traceback.py::annotate_fn:0 2025-12-04T13:48:23.2467876Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/rewriter.py::AST_Rewriter.visit_AnnAssign:0, line 97 <- wrt source file 2025-12-04T13:48:23.2468383Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/rewriter.py::AST_Rewriter.visit_AnnAssign:0 2025-12-04T13:48:23.2468865Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/core.py::reify:0, line 58 <- wrt source file 2025-12-04T13:48:23.2469329Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/core.py::reify:0 2025-12-04T13:48:23.2469802Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/match.py::VarDispatcher:0, line 48 <- wrt source file 2025-12-04T13:48:23.2470296Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/match.py::VarDispatcher:0 2025-12-04T13:48:23.2470769Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unifiable:0, line 19 <- wrt source file 2025-12-04T13:48:23.2471254Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unifiable:0 2025-12-04T13:48:23.2471722Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::reify_object:0, line 45 <- wrt source file 2025-12-04T13:48:23.2472259Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::reify_object:0 2025-12-04T13:48:23.2472761Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unify_object:0, line 102 <- wrt source file 2025-12-04T13:48:23.2473244Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/more.py::unify_object:0 2025-12-04T13:48:23.2473735Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge:0, line 37 <- wrt source file 2025-12-04T13:48:23.2474294Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge:0 2025-12-04T13:48:23.2474798Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge_with:0, line 64 <- wrt source file 2025-12-04T13:48:23.2475340Z * 
SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::merge_with:0 2025-12-04T13:48:23.2475851Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valmap:0, line 90 <- wrt source file 2025-12-04T13:48:23.2476367Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valmap:0 2025-12-04T13:48:23.2476871Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keymap:0, line 106 <- wrt source file 2025-12-04T13:48:23.2477387Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keymap:0 2025-12-04T13:48:23.2477893Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemmap:0, line 122 <- wrt source file 2025-12-04T13:48:23.2478413Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemmap:0 2025-12-04T13:48:23.2478922Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valfilter:0, line 138 <- wrt source file 2025-12-04T13:48:23.2479447Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::valfilter:0 2025-12-04T13:48:23.2479965Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keyfilter:0, line 158 <- wrt source file 2025-12-04T13:48:23.2480493Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::keyfilter:0 2025-12-04T13:48:23.2481011Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemfilter:0, line 178 <- wrt source file 2025-12-04T13:48:23.2481541Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::itemfilter:0 2025-12-04T13:48:23.2482130Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc:0, line 204 <- wrt source file 2025-12-04T13:48:23.2482645Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc:0 2025-12-04T13:48:23.2483153Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::dissoc:0, line 221 <- wrt source file 2025-12-04T13:48:23.2483698Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::dissoc:0 2025-12-04T13:48:23.2484205Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc_in:0, line 247 <- wrt source file 2025-12-04T13:48:23.2484727Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::assoc_in:0 2025-12-04T13:48:23.2485273Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::update_in:0, line 275 <- wrt source file 2025-12-04T13:48:23.2485809Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::update_in:0 2025-12-04T13:48:23.2486341Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::get_in:0, line 329 <- wrt source file 2025-12-04T13:48:23.2486855Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::get_in:0 2025-12-04T13:48:23.2487360Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::groupby:0, line 376 <- wrt source file 2025-12-04T13:48:23.2487880Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::groupby:0 2025-12-04T13:48:23.2488383Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::first:0, line 417 <- wrt source file 2025-12-04T13:48:23.2488893Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/unification_tools.py::first:0 2025-12-04T13:48:23.2489392Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::transitive_get:0, line 15 <- wrt source file 2025-12-04T13:48:23.2489892Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::transitive_get:0 2025-12-04T13:48:23.2490369Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::_toposort:0, line 42 <- wrt source file 2025-12-04T13:48:23.2490863Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::_toposort:0 2025-12-04T13:48:23.2491336Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::reverse_dict:0, line 70 <- wrt source file 2025-12-04T13:48:23.2491826Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::reverse_dict:0 2025-12-04T13:48:23.2492357Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::freeze:0, line 95 <- wrt source file 
2025-12-04T13:48:23.2492832Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/utils.py::freeze:0 2025-12-04T13:48:23.2493303Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/variable.py::variables:0, line 67 <- wrt source file 2025-12-04T13:48:23.2493796Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/variable.py::variables:0 2025-12-04T13:48:23.2494316Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/core.py::dispatch:0, line 28 <- wrt source file 2025-12-04T13:48:23.2494908Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/core.py::dispatch:0 2025-12-04T13:48:23.2495465Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher:0, line 113 <- wrt source file 2025-12-04T13:48:23.2496052Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher:0 2025-12-04T13:48:23.2496672Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.register:0, line 138 <- wrt source file 2025-12-04T13:48:23.2497286Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.register:0 2025-12-04T13:48:23.2497937Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.add:0, line 191 <- wrt source file 2025-12-04T13:48:23.2498529Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.add:0 
2025-12-04T13:48:23.2499120Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.dispatch:0, line 305 <- wrt source file 2025-12-04T13:48:23.2499735Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::Dispatcher.dispatch:0 2025-12-04T13:48:23.2500321Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::str_signature:0, line 436 <- wrt source file 2025-12-04T13:48:23.2500917Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/dispatcher.py::str_signature:0 2025-12-04T13:48:23.2501487Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::expand_tuples:0, line 18 <- wrt source file 2025-12-04T13:48:23.2502115Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::expand_tuples:0 2025-12-04T13:48:23.2502662Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::_toposort:0, line 41 <- wrt source file 2025-12-04T13:48:23.2503213Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::_toposort:0 2025-12-04T13:48:23.2503761Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::reverse_dict:0, line 68 <- wrt source file 2025-12-04T13:48:23.2504323Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::reverse_dict:0 2025-12-04T13:48:23.2504862Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::groupby:0, line 87 <- wrt source file 2025-12-04T13:48:23.2505406Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::groupby:0 2025-12-04T13:48:23.2505945Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::typename:0, line 117 <- wrt source file 2025-12-04T13:48:23.2506498Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/utils.py::typename:0 2025-12-04T13:48:23.2507077Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::isvariadic:0, line 47 <- wrt source file 2025-12-04T13:48:23.2507643Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::isvariadic:0 2025-12-04T13:48:23.2508229Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::Variadic:0, line 83 <- wrt source file 2025-12-04T13:48:23.2508792Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/unification/multipledispatch/variadic.py::Variadic:0 2025-12-04T13:48:23.2509324Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/graph_drawer.py::FxGraphDrawer.get_dot_graph:0, line 129 <- wrt source file 2025-12-04T13:48:23.2525196Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/graph_drawer.py::FxGraphDrawer.get_dot_graph:0 2025-12-04T13:48:23.2526078Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/shape_prop.py::ShapeProp:0, line 99 <- wrt source file 2025-12-04T13:48:23.2526641Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/shape_prop.py::ShapeProp:0 2025-12-04T13:48:23.2527153Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/split_module.py::split_module:0, line 94 <- wrt source file 2025-12-04T13:48:23.2527671Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/split_module.py::split_module:0 2025-12-04T13:48:23.2528230Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/utils/matcher_with_name_node_map_utils.py::SubgraphMatcherWithNameNodeMap:0, line 51 <- wrt source file 2025-12-04T13:48:23.2528904Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/passes/utils/matcher_with_name_node_map_utils.py::SubgraphMatcherWithNameNodeMap:0 2025-12-04T13:48:23.2529500Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/_check.py::AttributeTypeIsSupportedChecker:0, line 37 <- wrt source file 2025-12-04T13:48:23.2530067Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/_check.py::AttributeTypeIsSupportedChecker:0 2025-12-04T13:48:23.2530584Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_load_for_lite_interpreter:0, line 22 <- wrt source file 2025-12-04T13:48:23.2531104Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_load_for_lite_interpreter:0 2025-12-04T13:48:23.2531635Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_mobile_model_contained_types:0, line 125 <- wrt source file 2025-12-04T13:48:23.2532242Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_mobile_model_contained_types:0 2025-12-04T13:48:23.2532770Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_model_ops_and_info:0, line 225 <- wrt source file 
2025-12-04T13:48:23.2533289Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/jit/mobile/__init__.py::_get_model_ops_and_info:0 2025-12-04T13:48:23.2533765Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/_ops.py::logaddexp:0, line 1538 <- wrt source file 2025-12-04T13:48:23.2537525Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/_ops.py::logaddexp:0 2025-12-04T13:48:23.2538076Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py::is_masked_tensor:0, line 25 <- wrt source file 2025-12-04T13:48:23.2538540Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/masked/maskedtensor/core.py::is_masked_tensor:0 2025-12-04T13:48:23.2539007Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool2d_with_indices:0, line 470 <- wrt source file 2025-12-04T13:48:23.2674757Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool2d_with_indices:0 2025-12-04T13:48:23.2713927Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool3d_with_indices:0, line 589 <- wrt source file 2025-12-04T13:48:23.2978009Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::fractional_max_pool3d_with_indices:0 2025-12-04T13:48:23.3052186Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::gumbel_softmax:0, line 2198 <- wrt source file 2025-12-04T13:48:23.3194081Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::gumbel_softmax:0 2025-12-04T13:48:23.3194557Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding:0, line 2503 <- wrt source file 2025-12-04T13:48:23.3232413Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding:0 2025-12-04T13:48:23.3232871Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding_bag:0, line 2645 <- wrt source file 2025-12-04T13:48:23.3264978Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::embedding_bag:0 2025-12-04T13:48:23.3273052Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::ctc_loss:0, line 3087 <- wrt source file 2025-12-04T13:48:23.3493061Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::ctc_loss:0 2025-12-04T13:48:23.3493482Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::nll_loss:0, line 3157 <- wrt source file 2025-12-04T13:48:23.3712728Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::nll_loss:0 2025-12-04T13:48:23.3713147Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::cross_entropy:0, line 3476 <- wrt source file 2025-12-04T13:48:23.4163149Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::cross_entropy:0 2025-12-04T13:48:23.4163613Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy:0, line 3542 <- wrt source file 2025-12-04T13:48:23.4167342Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy:0 2025-12-04T13:48:23.4167830Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy_with_logits:0, line 3613 <- wrt source file 2025-12-04T13:48:23.4233572Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::binary_cross_entropy_with_logits:0 2025-12-04T13:48:23.4234019Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::pad:0, line 5387 <- wrt source file 2025-12-04T13:48:23.4239323Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/functional.py::pad:0 2025-12-04T13:48:23.4239849Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_input:0, line 32 <- wrt source file 2025-12-04T13:48:23.4245148Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_input:0 2025-12-04T13:48:23.4245567Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_weight:0, line 79 <- wrt source file 2025-12-04T13:48:23.4247996Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv1d_weight:0 2025-12-04T13:48:23.4248435Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_input:0, line 130 <- wrt source file 2025-12-04T13:48:23.4312817Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_input:0 2025-12-04T13:48:23.4313274Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_weight:0, line 177 <- wrt source file 2025-12-04T13:48:23.4383180Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv2d_weight:0 2025-12-04T13:48:23.4383577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_input:0, line 228 <- wrt source file 2025-12-04T13:48:23.5215217Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_input:0 2025-12-04T13:48:23.5215700Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_weight:0, line 275 <- wrt source file 2025-12-04T13:48:23.6019331Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/grad.py::conv3d_weight:0 2025-12-04T13:48:23.6020279Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::calculate_gain:0, line 172 <- wrt source file 2025-12-04T13:48:23.6021644Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::calculate_gain:0 2025-12-04T13:48:23.6022497Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::uniform_:0, line 231 <- wrt source file 2025-12-04T13:48:23.6023961Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::uniform_:0 2025-12-04T13:48:23.6024638Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::normal_:0, line 258 <- wrt source file 2025-12-04T13:48:23.6025714Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::normal_:0 2025-12-04T13:48:23.6026622Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::trunc_normal_:0, line 293 <- wrt source file 2025-12-04T13:48:23.6028126Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::trunc_normal_:0 2025-12-04T13:48:23.6029032Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::constant_:0, line 307 <- wrt source file 2025-12-04T13:48:23.6029878Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::constant_:0 2025-12-04T13:48:23.6030648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::ones_:0, line 324 <- wrt source file 2025-12-04T13:48:23.6031468Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::ones_:0 2025-12-04T13:48:23.6032232Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::zeros_:0, line 337 <- wrt source file 2025-12-04T13:48:23.6033017Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::zeros_:0 2025-12-04T13:48:23.6034086Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::eye_:0, line 353 <- wrt source file 2025-12-04T13:48:23.6034879Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::eye_:0 2025-12-04T13:48:23.6035545Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::dirac_:0, line 375 <- wrt source file 2025-12-04T13:48:23.6036326Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::dirac_:0 2025-12-04T13:48:23.6037113Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_uniform_:0, line 461 <- wrt source file 2025-12-04T13:48:23.6037954Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_uniform_:0 2025-12-04T13:48:23.6038701Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_normal_:0, line 493 <- wrt source file 2025-12-04T13:48:23.6039524Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::xavier_normal_:0 2025-12-04T13:48:23.6040207Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_uniform_:0, line 545 <- wrt source file 2025-12-04T13:48:23.6041051Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_uniform_:0 2025-12-04T13:48:23.6041782Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_normal_:0, line 610 <- wrt source file 2025-12-04T13:48:23.6042220Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::kaiming_normal_:0 2025-12-04T13:48:23.6043055Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::orthogonal_:0, line 649 <- wrt source file 2025-12-04T13:48:23.6043883Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::orthogonal_:0 2025-12-04T13:48:23.6044259Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::sparse_:0, line 702 <- wrt source file 2025-12-04T13:48:23.6045051Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/init.py::sparse_:0 2025-12-04T13:48:23.6045761Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/__init__.py::sdpa_kernel:0, line 124 <- wrt source file 2025-12-04T13:48:23.6046208Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/__init__.py::sdpa_kernel:0 2025-12-04T13:48:23.6046679Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/_registry.py::register_flash_attention_impl:0, line 47 <- wrt source file 2025-12-04T13:48:23.6047184Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/_registry.py::register_flash_attention_impl:0 2025-12-04T13:48:23.6048130Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/_registry.py::activate_flash_attention_impl:0, line 78 <- wrt source file 2025-12-04T13:48:23.6049014Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/_registry.py::activate_flash_attention_impl:0 2025-12-04T13:48:23.6049462Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/bias.py::CausalBias:0, line 94 <- wrt source file 2025-12-04T13:48:23.6049884Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/bias.py::CausalBias:0 2025-12-04T13:48:23.6050305Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/varlen.py::varlen_attn:0, line 166 <- wrt source file 2025-12-04T13:48:23.6050769Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/attention/varlen.py::varlen_attn:0 2025-12-04T13:48:23.6051193Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Threshold:0, line 72 <- wrt source file 
2025-12-04T13:48:23.6052384Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Threshold:0 2025-12-04T13:48:23.6053231Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU:0, line 120 <- wrt source file 2025-12-04T13:48:23.6054202Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU:0 2025-12-04T13:48:23.6055004Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::RReLU:0, line 185 <- wrt source file 2025-12-04T13:48:23.6055979Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::RReLU:0 2025-12-04T13:48:23.6056392Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardtanh:0, line 247 <- wrt source file 2025-12-04T13:48:23.6057252Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardtanh:0 2025-12-04T13:48:23.6058078Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU6:0, line 318 <- wrt source file 2025-12-04T13:48:23.6058915Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ReLU6:0 2025-12-04T13:48:23.6059673Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Sigmoid:0, line 349 <- wrt source file 2025-12-04T13:48:23.6060522Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Sigmoid:0 2025-12-04T13:48:23.6061314Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardsigmoid:0, line 384 <- wrt source file 2025-12-04T13:48:23.6062263Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardsigmoid:0 2025-12-04T13:48:23.6063083Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanh:0, line 420 <- wrt source file 2025-12-04T13:48:23.6064008Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanh:0 2025-12-04T13:48:23.6064872Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SiLU:0, line 456 <- wrt source file 2025-12-04T13:48:23.6065720Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SiLU:0 2025-12-04T13:48:23.6066827Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Mish:0, line 501 <- wrt source file 2025-12-04T13:48:23.6067251Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Mish:0 2025-12-04T13:48:23.6067667Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardswish:0, line 552 <- wrt source file 2025-12-04T13:48:23.6068578Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardswish:0 2025-12-04T13:48:23.6068994Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ELU:0, line 598 <- wrt source file 2025-12-04T13:48:23.6070341Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::ELU:0 2025-12-04T13:48:23.6071215Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::CELU:0, line 646 <- wrt source file 2025-12-04T13:48:23.6071643Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::CELU:0 2025-12-04T13:48:23.6072146Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SELU:0, line 705 <- wrt source file 2025-12-04T13:48:23.6072995Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::SELU:0 2025-12-04T13:48:23.6073432Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GLU:0, line 751 <- wrt source file 2025-12-04T13:48:23.6075831Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GLU:0 2025-12-04T13:48:23.6076420Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GELU:0, line 799 <- wrt source file 2025-12-04T13:48:23.6153713Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::GELU:0 2025-12-04T13:48:23.6154160Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardshrink:0, line 848 <- wrt source file 2025-12-04T13:48:23.6154620Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Hardshrink:0 2025-12-04T13:48:23.6155065Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LeakyReLU:0, line 903 <- wrt source file 2025-12-04T13:48:23.6155501Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LeakyReLU:0 2025-12-04T13:48:23.6155942Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSigmoid:0, line 945 <- wrt source file 2025-12-04T13:48:23.6232246Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSigmoid:0 2025-12-04T13:48:23.6234193Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softplus:0, line 981 <- wrt source file 2025-12-04T13:48:23.6234803Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softplus:0 2025-12-04T13:48:23.6235312Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softshrink:0, line 1030 <- wrt source file 2025-12-04T13:48:23.6235762Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softshrink:0 2025-12-04T13:48:23.6236223Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::MultiheadAttention:0, line 1148 <- wrt source file 2025-12-04T13:48:23.6236709Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::MultiheadAttention:0 2025-12-04T13:48:23.6237144Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::PReLU:0, line 1613 <- wrt source file 2025-12-04T13:48:23.6237567Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::PReLU:0 2025-12-04T13:48:23.6237991Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softsign:0, line 1664 <- wrt source file 2025-12-04T13:48:23.6238422Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softsign:0 2025-12-04T13:48:23.6238846Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanhshrink:0, line 1690 <- wrt source file 2025-12-04T13:48:23.6239548Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Tanhshrink:0 2025-12-04T13:48:23.6239964Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmin:0, line 1728 <- wrt source file 2025-12-04T13:48:23.6303191Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmin:0 2025-12-04T13:48:23.6304164Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax:0, line 1792 <- wrt source file 2025-12-04T13:48:23.6342205Z * 
SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax:0 2025-12-04T13:48:23.6342704Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax2d:0, line 1839 <- wrt source file 2025-12-04T13:48:23.6402795Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::Softmax2d:0 2025-12-04T13:48:23.6403225Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSoftmax:0, line 1878 <- wrt source file 2025-12-04T13:48:23.6482217Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/activation.py::LogSoftmax:0 2025-12-04T13:48:23.6482659Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm1d:0, line 341 <- wrt source file 2025-12-04T13:48:23.6762709Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm1d:0 2025-12-04T13:48:23.6763138Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm2d:0, line 453 <- wrt source file 2025-12-04T13:48:23.7024097Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm2d:0 2025-12-04T13:48:23.7024537Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm3d:0, line 565 <- wrt source file 2025-12-04T13:48:23.7944114Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::BatchNorm3d:0 2025-12-04T13:48:23.7944643Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm:0, line 690 <- wrt source file 2025-12-04T13:48:23.7945099Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm:0 2025-12-04T13:48:23.7945589Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm.convert_sync_batchnorm:0, line 857 <- wrt source file 2025-12-04T13:48:23.7946118Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py::SyncBatchNorm.convert_sync_batchnorm:0 2025-12-04T13:48:23.7946616Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/channelshuffle.py::ChannelShuffle:0, line 21 <- wrt source file 2025-12-04T13:48:23.7987410Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/channelshuffle.py::ChannelShuffle:0 2025-12-04T13:48:23.7992654Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential:0, line 81 <- wrt source file 2025-12-04T13:48:23.7993090Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential:0 2025-12-04T13:48:23.7993780Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.append:0, line 263 <- wrt source file 2025-12-04T13:48:23.7994247Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.append:0 2025-12-04T13:48:23.7994691Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.insert:0, line 286 <- wrt source file 2025-12-04T13:48:23.7997773Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.insert:0 2025-12-04T13:48:23.7998288Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.extend:0, line 317 <- wrt source file 2025-12-04T13:48:23.8002260Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::Sequential.extend:0 2025-12-04T13:48:23.8002752Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleList:0, line 346 <- wrt source file 2025-12-04T13:48:23.8003191Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleList:0 2025-12-04T13:48:23.8003619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleDict:0, line 529 <- wrt source file 2025-12-04T13:48:23.8004053Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ModuleDict:0 2025-12-04T13:48:23.8004488Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterList:0, line 661 <- wrt source file 2025-12-04T13:48:23.8004934Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterList:0 2025-12-04T13:48:23.8005373Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterDict:0, line 819 <- wrt source file 2025-12-04T13:48:23.8005813Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/container.py::ParameterDict:0 2025-12-04T13:48:23.8006252Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::PairwiseDistance:0, line 38 <- wrt source file 2025-12-04T13:48:23.8008554Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::PairwiseDistance:0 2025-12-04T13:48:23.8009002Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::CosineSimilarity:0, line 81 <- wrt source file 2025-12-04T13:48:23.8011798Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/distance.py::CosineSimilarity:0 2025-12-04T13:48:23.8012263Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout:0, line 60 <- wrt source file 
2025-12-04T13:48:23.8014379Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout:0 2025-12-04T13:48:23.8014803Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout1d:0, line 108 <- wrt source file 2025-12-04T13:48:23.8016646Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout1d:0 2025-12-04T13:48:23.8062879Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout2d:0, line 163 <- wrt source file 2025-12-04T13:48:23.8063300Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout2d:0 2025-12-04T13:48:23.8063721Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout3d:0, line 211 <- wrt source file 2025-12-04T13:48:23.8143183Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::Dropout3d:0 2025-12-04T13:48:23.8143603Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::AlphaDropout:0, line 257 <- wrt source file 2025-12-04T13:48:23.8145236Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::AlphaDropout:0 2025-12-04T13:48:23.8145718Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::FeatureAlphaDropout:0, line 309 <- wrt source file 2025-12-04T13:48:23.8352730Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/dropout.py::FeatureAlphaDropout:0 2025-12-04T13:48:23.8353234Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Flatten:0, line 29 <- wrt source file 2025-12-04T13:48:23.8356260Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Flatten:0 2025-12-04T13:48:23.8357429Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Unflatten:0, line 86 <- wrt source file 2025-12-04T13:48:23.8505984Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/flatten.py::Unflatten:0 2025-12-04T13:48:23.8506457Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Fold:0, line 224 <- wrt source file 2025-12-04T13:48:23.8508975Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Fold:0 2025-12-04T13:48:23.8509383Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Unfold:0, line 395 <- wrt source file 2025-12-04T13:48:23.8909215Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/fold.py::Unfold:0 2025-12-04T13:48:23.8909653Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm1d:0, line 188 <- wrt source file 2025-12-04T13:48:23.9213364Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm1d:0 2025-12-04T13:48:23.9213837Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm2d:0, line 304 <- wrt source file 2025-12-04T13:48:23.9544501Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm2d:0 2025-12-04T13:48:23.9545130Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm3d:0, line 420 <- wrt source file 2025-12-04T13:48:24.0478034Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/instancenorm.py::InstanceNorm3d:0 2025-12-04T13:48:24.0492561Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/lazy.py::LazyModuleMixin:0, line 77 <- wrt source file 2025-12-04T13:48:24.0493070Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/lazy.py::LazyModuleMixin:0 2025-12-04T13:48:24.0493524Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Identity:0, line 34 <- wrt source file 2025-12-04T13:48:24.0493953Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Identity:0 2025-12-04T13:48:24.0494363Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Linear:0, line 83 <- wrt source file 2025-12-04T13:48:24.0555222Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Linear:0 2025-12-04T13:48:24.0555715Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Bilinear:0, line 191 <- wrt source file 2025-12-04T13:48:24.6469907Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/linear.py::Bilinear:0 2025-12-04T13:48:24.6470679Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::L1Loss:0, line 116 <- wrt source file 2025-12-04T13:48:24.6476353Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::L1Loss:0 2025-12-04T13:48:24.6476775Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::NLLLoss:0, line 213 <- wrt source file 2025-12-04T13:48:24.7583431Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::NLLLoss:0 2025-12-04T13:48:24.7584176Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::PoissonNLLLoss:0, line 327 <- wrt source file 2025-12-04T13:48:24.7591803Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::PoissonNLLLoss:0 2025-12-04T13:48:24.7592863Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::GaussianNLLLoss:0, line 416 <- 
wrt source file 2025-12-04T13:48:24.7604139Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::GaussianNLLLoss:0 2025-12-04T13:48:24.7604626Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::KLDivLoss:0, line 531 <- wrt source file 2025-12-04T13:48:24.7832374Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::KLDivLoss:0 2025-12-04T13:48:24.7832873Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MSELoss:0, line 613 <- wrt source file 2025-12-04T13:48:24.7833339Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MSELoss:0 2025-12-04T13:48:24.7833776Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCELoss:0, line 696 <- wrt source file 2025-12-04T13:48:24.7834236Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCELoss:0 2025-12-04T13:48:24.7834689Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:0, line 762 <- wrt source file 2025-12-04T13:48:24.7868501Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:0 2025-12-04T13:48:24.7868976Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:1, line 810 <- wrt source file 2025-12-04T13:48:24.7934229Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::BCEWithLogitsLoss:1 2025-12-04T13:48:24.7934719Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiLabelMarginLoss:0, line 958 <- wrt source file 2025-12-04T13:48:24.7938934Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiLabelMarginLoss:0 2025-12-04T13:48:24.7939442Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:0, line 1284 <- wrt source file 2025-12-04T13:48:24.8430882Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:0 2025-12-04T13:48:24.8431506Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:1, line 1311 <- wrt source file 2025-12-04T13:48:24.8431995Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CrossEntropyLoss:1 2025-12-04T13:48:24.8432440Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CosineEmbeddingLoss:0, line 1464 <- wrt source file 2025-12-04T13:48:24.8433008Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CosineEmbeddingLoss:0 2025-12-04T13:48:24.8433448Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MarginRankingLoss:0, line 1531 <- wrt source file 2025-12-04T13:48:24.8433891Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MarginRankingLoss:0 2025-12-04T13:48:24.8434367Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiMarginLoss:0, line 1612 <- wrt source file 2025-12-04T13:48:24.8434798Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::MultiMarginLoss:0 2025-12-04T13:48:24.8435225Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginLoss:0, line 1714 <- wrt source file 2025-12-04T13:48:24.8443731Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginLoss:0 2025-12-04T13:48:24.8444204Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginWithDistanceLoss:0, line 1827 <- wrt source file 
2025-12-04T13:48:24.8812855Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::TripletMarginWithDistanceLoss:0 2025-12-04T13:48:24.8843014Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CTCLoss:0, line 1959 <- wrt source file 2025-12-04T13:48:24.9253965Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/loss.py::CTCLoss:0 2025-12-04T13:48:24.9342788Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.register_buffer:0, line 554 <- wrt source file 2025-12-04T13:48:24.9343322Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.register_buffer:0 2025-12-04T13:48:24.9343772Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.apply:0, line 1048 <- wrt source file 2025-12-04T13:48:24.9344212Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.apply:0 2025-12-04T13:48:24.9344630Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.to:0, line 1299 <- wrt source file 2025-12-04T13:48:24.9345052Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.to:0 2025-12-04T13:48:24.9345479Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.state_dict:0, line 2232 <- wrt source file 2025-12-04T13:48:24.9345924Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.state_dict:0 2025-12-04T13:48:24.9346362Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.parameters:0, line 2678 <- wrt source file 2025-12-04T13:48:24.9346807Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.parameters:0 
2025-12-04T13:48:24.9347505Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_parameters:0, line 2706 <- wrt source file 2025-12-04T13:48:24.9347971Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_parameters:0 2025-12-04T13:48:24.9348412Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.buffers:0, line 2733 <- wrt source file 2025-12-04T13:48:24.9348950Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.buffers:0 2025-12-04T13:48:24.9349391Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_buffers:0, line 2760 <- wrt source file 2025-12-04T13:48:24.9349896Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_buffers:0 2025-12-04T13:48:24.9350347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_children:0, line 2791 <- wrt source file 2025-12-04T13:48:24.9350804Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_children:0 2025-12-04T13:48:24.9351237Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.modules:0, line 2815 <- wrt source file 2025-12-04T13:48:24.9351672Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.modules:0 2025-12-04T13:48:24.9352215Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_modules:0, line 2853 <- wrt source file 2025-12-04T13:48:24.9352669Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/module.py::Module.named_modules:0 2025-12-04T13:48:24.9353136Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LocalResponseNorm:0, line 38 <- wrt source file 2025-12-04T13:48:25.0424868Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LocalResponseNorm:0 2025-12-04T13:48:25.0425376Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LayerNorm:0, line 163 <- wrt source file 2025-12-04T13:48:25.0563288Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::LayerNorm:0 2025-12-04T13:48:25.0563742Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::GroupNorm:0, line 274 <- wrt source file 2025-12-04T13:48:25.0673046Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::GroupNorm:0 2025-12-04T13:48:25.0673484Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::RMSNorm:0, line 369 <- wrt source file 2025-12-04T13:48:25.0677505Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/normalization.py::RMSNorm:0 2025-12-04T13:48:25.0678344Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad1d:0, line 70 <- wrt source file 2025-12-04T13:48:25.0682140Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad1d:0 2025-12-04T13:48:25.0682708Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad2d:0, line 123 <- wrt source file 2025-12-04T13:48:25.0697065Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad2d:0 2025-12-04T13:48:25.0697650Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad3d:0, line 189 <- wrt source file 
2025-12-04T13:48:25.3530437Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::CircularPad3d:0 2025-12-04T13:48:25.3544188Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad1d:0, line 244 <- wrt source file 2025-12-04T13:48:25.3553309Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad1d:0 2025-12-04T13:48:25.3553760Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad2d:0, line 298 <- wrt source file 2025-12-04T13:48:25.3557140Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad2d:0 2025-12-04T13:48:25.3557580Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad3d:0, line 355 <- wrt source file 2025-12-04T13:48:25.3862778Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ConstantPad3d:0 2025-12-04T13:48:25.3902408Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad1d:0, line 401 <- wrt source file 2025-12-04T13:48:25.4014387Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad1d:0 2025-12-04T13:48:25.4014912Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad2d:0, line 446 <- wrt source file 2025-12-04T13:48:25.4124262Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad2d:0 2025-12-04T13:48:25.4162872Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad3d:0, line 505 <- wrt source file 2025-12-04T13:48:25.4203568Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReflectionPad3d:0 
2025-12-04T13:48:25.4204169Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad1d:0, line 565 <- wrt source file 2025-12-04T13:48:25.4353198Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad1d:0 2025-12-04T13:48:25.4353656Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad2d:0, line 610 <- wrt source file 2025-12-04T13:48:25.4503797Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad2d:0 2025-12-04T13:48:25.4504255Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad3d:0, line 669 <- wrt source file 2025-12-04T13:48:25.5997715Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ReplicationPad3d:0 2025-12-04T13:48:25.5998202Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad1d:0, line 704 <- wrt source file 2025-12-04T13:48:25.6000782Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad1d:0 2025-12-04T13:48:25.6001201Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad2d:0, line 762 <- wrt source file 2025-12-04T13:48:25.6003078Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad2d:0 2025-12-04T13:48:25.6003877Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad3d:0, line 824 <- wrt source file 2025-12-04T13:48:25.6382344Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/padding.py::ZeroPad3d:0 2025-12-04T13:48:25.6382843Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelShuffle:0, line 40 <- wrt source 
file 2025-12-04T13:48:25.6387626Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelShuffle:0 2025-12-04T13:48:25.6402961Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelUnshuffle:0, line 99 <- wrt source file 2025-12-04T13:48:25.6443042Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pixelshuffle.py::PixelUnshuffle:0 2025-12-04T13:48:25.6443929Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool1d:0, line 129 <- wrt source file 2025-12-04T13:48:25.6542462Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool1d:0 2025-12-04T13:48:25.6542931Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool2d:0, line 207 <- wrt source file 2025-12-04T13:48:25.6594357Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool2d:0 2025-12-04T13:48:25.6602860Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool3d:0, line 291 <- wrt source file 2025-12-04T13:48:25.7153439Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxPool3d:0 2025-12-04T13:48:25.7262455Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool1d:0, line 366 <- wrt source file 2025-12-04T13:48:25.7383236Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool1d:0 2025-12-04T13:48:25.7412740Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool2d:0, line 452 <- wrt source file 2025-12-04T13:48:25.7534967Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool2d:0 2025-12-04T13:48:25.7535416Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool3d:0, line 550 <- wrt source file 2025-12-04T13:48:25.7926091Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::MaxUnpool3d:0 2025-12-04T13:48:25.7926581Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool1d:0, line 642 <- wrt source file 2025-12-04T13:48:25.8007112Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool1d:0 2025-12-04T13:48:25.8007577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool2d:0, line 738 <- wrt source file 2025-12-04T13:48:25.8083231Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool2d:0 2025-12-04T13:48:25.8083657Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool3d:0, line 855 <- wrt source file 2025-12-04T13:48:25.8614764Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AvgPool3d:0 2025-12-04T13:48:25.8622948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool2d:0, line 946 <- wrt source file 2025-12-04T13:48:25.8705591Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool2d:0 2025-12-04T13:48:25.8706069Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool3d:0, line 1033 <- wrt source file 2025-12-04T13:48:25.9025007Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::FractionalMaxPool3d:0 2025-12-04T13:48:25.9025596Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool1d:0, line 1156 <- wrt source file 2025-12-04T13:48:25.9203102Z * 
SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool1d:0 2025-12-04T13:48:25.9203631Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool2d:0, line 1212 <- wrt source file 2025-12-04T13:48:25.9861698Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool2d:0 2025-12-04T13:48:25.9862165Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool3d:0, line 1276 <- wrt source file 2025-12-04T13:48:26.0972554Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::LPPool3d:0 2025-12-04T13:48:26.0973073Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool1d:0, line 1332 <- wrt source file 2025-12-04T13:48:26.1013341Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool1d:0 2025-12-04T13:48:26.1013824Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool2d:0, line 1367 <- wrt source file 2025-12-04T13:48:26.1232953Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool2d:0 2025-12-04T13:48:26.1233411Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool3d:0, line 1411 <- wrt source file 2025-12-04T13:48:26.1464507Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveMaxPool3d:0 2025-12-04T13:48:26.1465069Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool1d:0, line 1459 <- wrt source file 2025-12-04T13:48:26.1533060Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool1d:0 2025-12-04T13:48:26.1572929Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool2d:0, line 1493 <- wrt source file 2025-12-04T13:48:26.1756589Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool2d:0 2025-12-04T13:48:26.1757044Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool3d:0, line 1533 <- wrt source file 2025-12-04T13:48:26.1983648Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/pooling.py::AdaptiveAvgPool3d:0 2025-12-04T13:48:26.1984078Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNN:0, line 598 <- wrt source file 2025-12-04T13:48:26.2863824Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNN:0 2025-12-04T13:48:26.2881560Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTM:0, line 963 <- wrt source file 2025-12-04T13:48:26.4013368Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTM:0 2025-12-04T13:48:26.4013774Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRU:0, line 1305 <- wrt source file 2025-12-04T13:48:26.4903194Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRU:0 2025-12-04T13:48:26.4903916Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNNCell:0, line 1561 <- wrt source file 2025-12-04T13:48:26.5783766Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::RNNCell:0 2025-12-04T13:48:26.5784188Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTMCell:0, line 1683 <- wrt source file 2025-12-04T13:48:26.6074115Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::LSTMCell:0 2025-12-04T13:48:26.6074524Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRUCell:0, line 1797 <- wrt source file 2025-12-04T13:48:26.6953735Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/rnn.py::GRUCell:0 2025-12-04T13:48:26.6954201Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding:0, line 71 <- wrt source file 2025-12-04T13:48:26.6961838Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding:0 2025-12-04T13:48:26.6963155Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding.from_pretrained:0, line 243 <- wrt source file 2025-12-04T13:48:26.6965424Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::Embedding.from_pretrained:0 2025-12-04T13:48:26.6966207Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag:0, line 324 <- wrt source file 2025-12-04T13:48:26.7027524Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag:0 2025-12-04T13:48:26.7027990Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag.from_pretrained:0, line 523 <- wrt source file 2025-12-04T13:48:26.7031743Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py::EmbeddingBag.from_pretrained:0 2025-12-04T13:48:26.7032293Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer:0, line 91 <- wrt source file 2025-12-04T13:48:32.5817164Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer:0 2025-12-04T13:48:32.5834125Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer.forward:0, line 267 <- wrt source file 2025-12-04T13:48:32.5835495Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::Transformer.forward:0 2025-12-04T13:48:32.5836553Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoder:0, line 345 <- wrt source file 2025-12-04T13:48:33.8986437Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoder:0 2025-12-04T13:48:33.8993295Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoder:0, line 578 <- wrt source file 2025-12-04T13:48:36.5565937Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoder:0 2025-12-04T13:48:36.5573515Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoderLayer:0, line 702 <- wrt source file 2025-12-04T13:48:36.7504906Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerEncoderLayer:0 2025-12-04T13:48:36.7506914Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoderLayer:0, line 1014 <- wrt source file 2025-12-04T13:48:37.1665782Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py::TransformerDecoderLayer:0 2025-12-04T13:48:37.1759085Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::Upsample:0, line 77 <- wrt source file 2025-12-04T13:48:37.1780111Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::Upsample:0 2025-12-04T13:48:37.1780967Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingNearest2d:0, line 229 <- wrt source file 2025-12-04T13:48:37.1789170Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingNearest2d:0 2025-12-04T13:48:37.1790125Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingBilinear2d:0, line 279 <- wrt source file 2025-12-04T13:48:37.1795342Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/upsampling.py::UpsamplingBilinear2d:0 2025-12-04T13:48:37.1795824Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py::DataParallel:0, line 128 <- wrt source file 2025-12-04T13:48:37.1796332Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/data_parallel.py::DataParallel:0 2025-12-04T13:48:37.1796931Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel:0, line 644 <- wrt source file 2025-12-04T13:48:37.1798832Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel:0 2025-12-04T13:48:37.1799476Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.no_sync:0, line 1451 <- wrt source file 2025-12-04T13:48:37.1800038Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.no_sync:0 2025-12-04T13:48:37.1800577Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.join:0, line 1838 <- wrt source file 2025-12-04T13:48:37.1801108Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.join:0 2025-12-04T13:48:37.1801645Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:0, line 2004 <- wrt source file 2025-12-04T13:48:37.1802386Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:0 2025-12-04T13:48:37.1802934Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:1, line 2014 <- wrt source file 2025-12-04T13:48:37.1803821Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel.register_comm_hook:1 2025-12-04T13:48:37.1804388Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_builtin_comm_hook:0, line 2049 <- wrt source file 2025-12-04T13:48:37.1805001Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_builtin_comm_hook:0 2025-12-04T13:48:37.1805684Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_fused_optim:0, line 2107 <- wrt source file 2025-12-04T13:48:37.1806252Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/parallel/distributed.py::DistributedDataParallel._register_fused_optim:0 2025-12-04T13:48:37.1806819Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_per_sample_grad.py::call_for_per_sample_grads:0, line 35 <- wrt source file 2025-12-04T13:48:37.1807303Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_per_sample_grad.py::call_for_per_sample_grads:0 2025-12-04T13:48:37.1807730Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/init.py::skip_init:0, line 33 <- wrt source file 
2025-12-04T13:48:37.1812267Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/init.py::skip_init:0 2025-12-04T13:48:37.1812755Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv2d_weight_memory_format:0, line 64 <- wrt source file 2025-12-04T13:48:37.1813502Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv2d_weight_memory_format:0 2025-12-04T13:48:37.1814029Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv3d_weight_memory_format:0, line 143 <- wrt source file 2025-12-04T13:48:37.1814579Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/memory_format.py::convert_conv3d_weight_memory_format:0 2025-12-04T13:48:37.1815073Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::orthogonal:0, line 267 <- wrt source file 2025-12-04T13:48:37.1815695Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::orthogonal:0 2025-12-04T13:48:37.1816263Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::weight_norm:0, line 362 <- wrt source file 2025-12-04T13:48:37.1821830Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::weight_norm:0 2025-12-04T13:48:37.1822338Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::spectral_norm:0, line 593 <- wrt source file 2025-12-04T13:48:37.1822928Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrizations.py::spectral_norm:0 2025-12-04T13:48:37.1823378Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::identity:0, line 852 <- wrt source file 2025-12-04T13:48:37.1823824Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::identity:0 2025-12-04T13:48:37.1824346Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_unstructured:0, line 888 <- wrt source file 2025-12-04T13:48:37.1824951Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_unstructured:0 2025-12-04T13:48:37.1825389Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::l1_unstructured:0, line 931 <- wrt source file 2025-12-04T13:48:37.1825822Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::l1_unstructured:0 2025-12-04T13:48:37.1826266Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_structured:0, line 971 <- wrt source file 2025-12-04T13:48:37.1827082Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::random_structured:0 2025-12-04T13:48:37.1827522Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::ln_structured:0, line 1017 <- wrt source file 2025-12-04T13:48:37.1836812Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::ln_structured:0 2025-12-04T13:48:37.1838543Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::global_unstructured:0, line 1072 <- wrt source file 2025-12-04T13:48:37.1847334Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::global_unstructured:0 2025-12-04T13:48:37.1847764Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::custom_from_mask:0, line 1175 <- wrt source file 2025-12-04T13:48:37.1852052Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::custom_from_mask:0 2025-12-04T13:48:37.1852499Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::remove:0, line 1203 <- wrt source file 2025-12-04T13:48:37.1855801Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::remove:0 2025-12-04T13:48:37.1856206Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::is_pruned:0, line 1231 <- wrt source file 2025-12-04T13:48:37.1859882Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/prune.py::is_pruned:0 2025-12-04T13:48:37.1860307Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_packed_sequence:0, line 359 <- wrt source file 2025-12-04T13:48:37.1868464Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_packed_sequence:0 2025-12-04T13:48:37.1868881Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_sequence:0, line 439 <- wrt source file 2025-12-04T13:48:37.1871242Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pad_sequence:0 2025-12-04T13:48:37.1871652Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpad_sequence:0, line 500 <- wrt source file 2025-12-04T13:48:37.2142408Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpad_sequence:0 2025-12-04T13:48:37.2142841Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pack_sequence:0, line 556 <- wrt source file 2025-12-04T13:48:37.2143255Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::pack_sequence:0 2025-12-04T13:48:37.2143659Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpack_sequence:0, line 584 <- wrt source file 2025-12-04T13:48:37.2144076Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/rnn.py::unpack_sequence:0 2025-12-04T13:48:37.2144558Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::spectral_norm:0, line 314 <- wrt source file 2025-12-04T13:48:37.2145003Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::spectral_norm:0 2025-12-04T13:48:37.2145451Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::remove_spectral_norm:0, line 347 <- wrt source file 2025-12-04T13:48:37.2145970Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/spectral_norm.py::remove_spectral_norm:0 2025-12-04T13:48:37.2146425Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/stateless.py::functional_call:0, line 193 <- wrt source file 2025-12-04T13:48:37.2146867Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/stateless.py::functional_call:0 2025-12-04T13:48:37.2147317Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::weight_norm:0, line 134 <- wrt source file 2025-12-04T13:48:37.2172432Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::weight_norm:0 2025-12-04T13:48:37.2172888Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::remove_weight_norm:0, line 156 <- wrt source file 2025-12-04T13:48:37.2326582Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py::remove_weight_norm:0 2025-12-04T13:48:37.2332334Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/conv_utils.py::unfold3d:0, line 315 <- wrt source file 2025-12-04T13:48:37.2332867Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/conv_utils.py::unfold3d:0 2025-12-04T13:48:37.2333410Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/expanded_weights_utils.py::sum_over_all_but_batch_and_last_n:0, line 178 <- wrt source file 2025-12-04T13:48:37.2334011Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/_expanded_weights/expanded_weights_utils.py::sum_over_all_but_batch_and_last_n:0 2025-12-04T13:48:37.2334510Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LambdaLR:0, line 357 <- wrt source file 2025-12-04T13:48:37.2334948Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LambdaLR:0 2025-12-04T13:48:37.2335378Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiplicativeLR:0, line 483 <- wrt source file 2025-12-04T13:48:37.2335833Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiplicativeLR:0 2025-12-04T13:48:37.2336261Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::StepLR:0, line 608 <- wrt source file 2025-12-04T13:48:37.2336674Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::StepLR:0 2025-12-04T13:48:37.2337087Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiStepLR:0, line 695 <- wrt source file 2025-12-04T13:48:37.2337514Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::MultiStepLR:0 2025-12-04T13:48:37.2337924Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ConstantLR:0, line 791 <- wrt source file 2025-12-04T13:48:37.2338494Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ConstantLR:0 2025-12-04T13:48:37.2338922Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LinearLR:0, line 898 <- wrt source file 2025-12-04T13:48:37.2339335Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::LinearLR:0 2025-12-04T13:48:37.2339752Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ExponentialLR:0, line 1020 <- wrt source file 2025-12-04T13:48:37.2340241Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ExponentialLR:0 2025-12-04T13:48:37.2340666Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::SequentialLR:0, line 1097 <- wrt source file 2025-12-04T13:48:37.2341115Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::SequentialLR:0 2025-12-04T13:48:37.2341536Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::PolynomialLR:0, line 1249 <- wrt source file 2025-12-04T13:48:37.2342024Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::PolynomialLR:0 2025-12-04T13:48:37.2342456Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingLR:0, line 1378 <- wrt source file 2025-12-04T13:48:37.2342911Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingLR:0 2025-12-04T13:48:37.2343349Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ChainedScheduler:0, line 1490 <- wrt source file 2025-12-04T13:48:37.2343793Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::ChainedScheduler:0 2025-12-04T13:48:37.2344210Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CyclicLR:0, line 1863 <- wrt source file 2025-12-04T13:48:37.2344625Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CyclicLR:0 2025-12-04T13:48:37.2345077Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts:0, line 2129 <- wrt source file 2025-12-04T13:48:37.2345563Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts:0 2025-12-04T13:48:37.2346048Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:0, line 2211 <- wrt source file 2025-12-04T13:48:37.2346554Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:0 2025-12-04T13:48:37.2347048Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:1, line 2227 <- wrt source file 2025-12-04T13:48:37.2347553Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::CosineAnnealingWarmRestarts.step:1 2025-12-04T13:48:37.2348008Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::OneCycleLR:0, line 2367 <- wrt source file 2025-12-04T13:48:37.2348427Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py::OneCycleLR:0 2025-12-04T13:48:37.2348860Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/optimizer.py::Optimizer.load_state_dict:0, line 900 <- wrt source file 2025-12-04T13:48:37.2349342Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/optimizer.py::Optimizer.load_state_dict:0 2025-12-04T13:48:37.2349775Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:0, line 155 <- wrt source file 2025-12-04T13:48:37.2350196Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:0 2025-12-04T13:48:37.2350633Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:1, line 181 <- wrt source file 2025-12-04T13:48:37.2351050Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::AveragedModel:1 2025-12-04T13:48:37.2351453Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::update_bn:0, line 350 <- wrt source file 2025-12-04T13:48:37.2351919Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::update_bn:0 2025-12-04T13:48:37.2352311Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::SWALR:0, line 409 <- wrt source file 2025-12-04T13:48:37.2352713Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/swa_utils.py::SWALR:0 2025-12-04T13:48:37.2353116Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/package/glob_group.py::GlobGroup:0, line 22 <- wrt source file 2025-12-04T13:48:37.2353538Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/package/glob_group.py::GlobGroup:0 2025-12-04T13:48:37.2354012Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::_KinetoProfile.toggle_collection_dynamic:0, line 317 <- wrt source file 2025-12-04T13:48:37.2354534Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::_KinetoProfile.toggle_collection_dynamic:0 2025-12-04T13:48:37.2354989Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::profile:0, line 659 <- wrt source file 2025-12-04T13:48:37.2355400Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/profiler/profiler.py::profile:0 2025-12-04T13:48:37.2355836Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/semi_structured.py::to_sparse_semi_structured:0, line 342 <- wrt source file 2025-12-04T13:48:37.2356318Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/sparse/semi_structured.py::to_sparse_semi_structured:0 2025-12-04T13:48:37.2356771Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_comparison.py::assert_close:0, line 1477 <- wrt source file 2025-12-04T13:48:37.2376232Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_comparison.py::assert_close:0 2025-12-04T13:48:37.2376659Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_creation.py::make_tensor:0, line 114 <- wrt source file 2025-12-04T13:48:37.2377089Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_creation.py::make_tensor:0 2025-12-04T13:48:37.2377542Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::parametrize:0, line 648 <- wrt source file 2025-12-04T13:48:37.2378008Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::parametrize:0 2025-12-04T13:48:37.2378477Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::reparametrize:0, line 769 <- wrt source file 2025-12-04T13:48:37.2378994Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::reparametrize:0 2025-12-04T13:48:37.2379456Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::decorateIf:0, line 858 <- wrt source file 2025-12-04T13:48:37.2379918Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::decorateIf:0 2025-12-04T13:48:37.2380431Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_symmetric_psd_matrix:0, line 4839 <- wrt source file 2025-12-04T13:48:37.2380947Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_symmetric_psd_matrix:0 2025-12-04T13:48:37.2381469Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_psd_matrix:0, line 4853 <- wrt source file 2025-12-04T13:48:37.2382033Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_psd_matrix:0 2025-12-04T13:48:37.2382535Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_pd_matrix:0, line 4883 <- wrt source file 2025-12-04T13:48:37.2383044Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py::random_hermitian_pd_matrix:0 2025-12-04T13:48:37.2383523Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::logs_to_string:0, line 194 <- wrt source file 2025-12-04T13:48:37.2383995Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::logs_to_string:0 2025-12-04T13:48:37.2384479Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::multiple_logs_to_string:0, line 220 <- wrt source file 2025-12-04T13:48:37.2384980Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/logging_utils.py::multiple_logs_to_string:0 2025-12-04T13:48:37.2385517Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/distributed/_tensor/common_dtensor.py::skip_unless_torch_gpu:0, line 341 <- wrt source file 2025-12-04T13:48:37.2386087Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/distributed/_tensor/common_dtensor.py::skip_unless_torch_gpu:0 2025-12-04T13:48:37.2386648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/autograd_registration.py::autograd_registration_check:0, line 29 <- wrt source file 2025-12-04T13:48:37.2395945Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/optests/autograd_registration.py::autograd_registration_check:0 2025-12-04T13:48:37.2396442Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::register_pytree_node:0, line 159 <- wrt source file 2025-12-04T13:48:37.2396890Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::register_pytree_node:0 2025-12-04T13:48:37.2397312Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_is_leaf:0, line 316 <- wrt source file 2025-12-04T13:48:37.2399927Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_is_leaf:0 2025-12-04T13:48:37.2400347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_flatten:0, line 359 <- wrt source file 2025-12-04T13:48:37.2403614Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_flatten:0 2025-12-04T13:48:37.2404043Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_unflatten:0, line 396 <- wrt source file 2025-12-04T13:48:37.2405480Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_unflatten:0 2025-12-04T13:48:37.2405961Z * 
DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_iter:0, line 429 <- wrt source file 2025-12-04T13:48:37.2408677Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_iter:0 2025-12-04T13:48:37.2409101Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_leaves:0, line 464 <- wrt source file 2025-12-04T13:48:37.2410714Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_leaves:0 2025-12-04T13:48:37.2411130Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_structure:0, line 499 <- wrt source file 2025-12-04T13:48:37.2412602Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_structure:0 2025-12-04T13:48:37.2413018Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_map:0, line 536 <- wrt source file 2025-12-04T13:48:37.2415264Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::tree_map:0 2025-12-04T13:48:37.2415694Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::broadcast_prefix:0, line 929 <- wrt source file 2025-12-04T13:48:37.2419188Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_cxx_pytree.py::broadcast_prefix:0 2025-12-04T13:48:37.2419619Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_dataclass:0, line 308 <- wrt source file 2025-12-04T13:48:37.2425833Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_dataclass:0 2025-12-04T13:48:37.2426256Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_constant:0, line 428 <- wrt source file 2025-12-04T13:48:37.2430786Z * SUCCESS: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::register_constant:0 2025-12-04T13:48:37.2431196Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_is_leaf:0, line 1058 <- wrt source file 2025-12-04T13:48:37.2433714Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_is_leaf:0 2025-12-04T13:48:37.2434113Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_map:0, line 1497 <- wrt source file 2025-12-04T13:48:37.2436664Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_pytree.py::tree_map:0 2025-12-04T13:48:37.2437112Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::rename_privateuse1_backend:0, line 71 <- wrt source file 2025-12-04T13:48:37.2437616Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::rename_privateuse1_backend:0 2025-12-04T13:48:37.2438135Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::generate_methods_for_privateuse1_backend:0, line 382 <- wrt source file 2025-12-04T13:48:37.2438706Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::generate_methods_for_privateuse1_backend:0 2025-12-04T13:48:37.2439201Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::_get_custom_mod_func:0, line 417 <- wrt source file 2025-12-04T13:48:37.2439675Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/backend_registration.py::_get_custom_mod_func:0 2025-12-04T13:48:37.2440171Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::checkpoint_sequential:0, line 561 <- wrt source file 2025-12-04T13:48:37.2440632Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::checkpoint_sequential:0 2025-12-04T13:48:37.2441093Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::set_checkpoint_early_stop:0, line 763 <- wrt source file 2025-12-04T13:48:37.2441555Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::set_checkpoint_early_stop:0 2025-12-04T13:48:37.2442055Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::SelectiveCheckpointContext:0, line 1257 <- wrt source file 2025-12-04T13:48:37.2442532Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::SelectiveCheckpointContext:0 2025-12-04T13:48:37.2443011Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::create_selective_checkpoint_contexts:0, line 1421 <- wrt source file 2025-12-04T13:48:37.2448171Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/checkpoint.py::create_selective_checkpoint_contexts:0 2025-12-04T13:48:37.2448648Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CppExtension:0, line 1247 <- wrt source file 2025-12-04T13:48:37.2449084Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CppExtension:0 2025-12-04T13:48:37.2449522Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:0, line 1319 <- wrt source file 2025-12-04T13:48:37.2449961Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:0 2025-12-04T13:48:37.2450384Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:1, line 1397 <- wrt source file 2025-12-04T13:48:37.2450817Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::CUDAExtension:1 2025-12-04T13:48:37.2451244Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::SyclExtension:0, line 1509 <- wrt source file 2025-12-04T13:48:37.2451683Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::SyclExtension:0 2025-12-04T13:48:37.2452124Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load:0, line 1759 <- wrt source file 2025-12-04T13:48:37.2452529Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load:0 2025-12-04T13:48:37.2452935Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load_inline:0, line 2032 <- wrt source file 2025-12-04T13:48:37.2453356Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py::load_inline:0 2025-12-04T13:48:37.2453785Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/dlpack.py::from_dlpack:0, line 93 <- wrt source file 2025-12-04T13:48:37.2459296Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/dlpack.py::from_dlpack:0 2025-12-04T13:48:37.2459739Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/throughput_benchmark.py::ThroughputBenchmark:0, line 78 <- wrt source file 2025-12-04T13:48:37.2460242Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/throughput_benchmark.py::ThroughputBenchmark:0 2025-12-04T13:48:37.2460738Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_sympy/functions.py::MinMaxBase._collapse_arguments:0, line 742 <- wrt source file 2025-12-04T13:48:37.2689599Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_sympy/functions.py::MinMaxBase._collapse_arguments:0 
2025-12-04T13:48:37.2690257Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::IterableDataset:0, line 94 <- wrt source file 2025-12-04T13:48:37.2692099Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::IterableDataset:0 2025-12-04T13:48:37.2692555Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::StackDataset:0, line 218 <- wrt source file 2025-12-04T13:48:37.2693000Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::StackDataset:0 2025-12-04T13:48:37.2693415Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::random_split:0, line 438 <- wrt source file 2025-12-04T13:48:37.2693841Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/dataset.py::random_split:0 2025-12-04T13:48:37.2694292Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/distributed.py::DistributedSampler:0, line 55 <- wrt source file 2025-12-04T13:48:37.2694766Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/distributed.py::DistributedSampler:0 2025-12-04T13:48:37.2695190Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::Sampler:0, line 36 <- wrt source file 2025-12-04T13:48:37.2695604Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::Sampler:0 2025-12-04T13:48:37.2696038Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::WeightedRandomSampler:0, line 225 <- wrt source file 2025-12-04T13:48:37.2697138Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::WeightedRandomSampler:0 2025-12-04T13:48:37.2697581Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::BatchSampler:0, line 
296 <- wrt source file 2025-12-04T13:48:37.2700182Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/sampler.py::BatchSampler:0 2025-12-04T13:48:37.2700615Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_convert:0, line 39 <- wrt source file 2025-12-04T13:48:37.2701921Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_convert:0 2025-12-04T13:48:37.2702361Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::collate:0, line 137 <- wrt source file 2025-12-04T13:48:37.2704360Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::collate:0 2025-12-04T13:48:37.2704854Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_collate:0, line 367 <- wrt source file 2025-12-04T13:48:37.2707187Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/_utils/collate.py::default_collate:0 2025-12-04T13:48:37.2707656Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::IterDataPipe:0, line 97 <- wrt source file 2025-12-04T13:48:37.2708700Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::IterDataPipe:0 2025-12-04T13:48:37.2709171Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::MapDataPipe:0, line 269 <- wrt source file 2025-12-04T13:48:37.2709667Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/datapipe.py::MapDataPipe:0 2025-12-04T13:48:37.2710157Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::MapperIterDataPipe:0, line 53 <- wrt source file 2025-12-04T13:48:37.2710673Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::MapperIterDataPipe:0 2025-12-04T13:48:37.2711184Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::CollatorIterDataPipe:0, line 202 <- wrt source file 2025-12-04T13:48:37.2711709Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/callable.py::CollatorIterDataPipe:0 2025-12-04T13:48:37.2712259Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combinatorics.py::ShufflerIterDataPipe:0, line 90 <- wrt source file 2025-12-04T13:48:37.2712801Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combinatorics.py::ShufflerIterDataPipe:0 2025-12-04T13:48:37.2713347Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ConcaterIterDataPipe:0, line 38 <- wrt source file 2025-12-04T13:48:37.2724344Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ConcaterIterDataPipe:0 2025-12-04T13:48:37.2724887Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ForkerIterDataPipe:0, line 89 <- wrt source file 2025-12-04T13:48:37.2725423Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ForkerIterDataPipe:0 2025-12-04T13:48:37.2725942Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::_ChildDataPipe:0, line 308 <- wrt source file 2025-12-04T13:48:37.2726459Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::_ChildDataPipe:0 2025-12-04T13:48:37.2726992Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::DemultiplexerIterDataPipe:0, line 397 <- wrt source file 2025-12-04T13:48:37.2727555Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::DemultiplexerIterDataPipe:0 2025-12-04T13:48:37.2728096Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::MultiplexerIterDataPipe:0, line 615 <- wrt source file 2025-12-04T13:48:37.2728657Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::MultiplexerIterDataPipe:0 2025-12-04T13:48:37.2729270Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ZipperIterDataPipe:0, line 685 <- wrt source file 2025-12-04T13:48:37.2729805Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/combining.py::ZipperIterDataPipe:0 2025-12-04T13:48:37.2730391Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/filelister.py::FileListerIterDataPipe:0, line 29 <- wrt source file 2025-12-04T13:48:37.2730938Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/filelister.py::FileListerIterDataPipe:0 2025-12-04T13:48:37.2731496Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/fileopener.py::FileOpenerIterDataPipe:0, line 33 <- wrt source file 2025-12-04T13:48:37.2732135Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/fileopener.py::FileOpenerIterDataPipe:0 2025-12-04T13:48:37.2732666Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::BatcherIterDataPipe:0, line 41 <- wrt source file 2025-12-04T13:48:37.2733196Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::BatcherIterDataPipe:0 2025-12-04T13:48:37.2733720Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::UnBatcherIterDataPipe:0, line 102 <- wrt source file 2025-12-04T13:48:37.2734277Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::UnBatcherIterDataPipe:0 2025-12-04T13:48:37.2734803Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::GrouperIterDataPipe:0, line 169 <- wrt source file 2025-12-04T13:48:37.2735333Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/grouping.py::GrouperIterDataPipe:0 2025-12-04T13:48:37.2735852Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/selecting.py::FilterIterDataPipe:0, line 37 <- wrt source file 2025-12-04T13:48:37.2736390Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/selecting.py::FilterIterDataPipe:0 2025-12-04T13:48:37.2736933Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/streamreader.py::StreamReaderIterDataPipe:0, line 24 <- wrt source file 2025-12-04T13:48:37.2737505Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/streamreader.py::StreamReaderIterDataPipe:0 2025-12-04T13:48:37.2738052Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/utils.py::IterableWrapperIterDataPipe:0, line 29 <- wrt source file 2025-12-04T13:48:37.2738604Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/iter/utils.py::IterableWrapperIterDataPipe:0 2025-12-04T13:48:37.2739138Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/callable.py::MapperMapDataPipe:0, line 36 <- wrt source file 2025-12-04T13:48:37.2739656Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/callable.py::MapperMapDataPipe:0 2025-12-04T13:48:37.2740186Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combinatorics.py::ShufflerIterDataPipe:0, line 34 <- wrt source file 2025-12-04T13:48:37.2740772Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combinatorics.py::ShufflerIterDataPipe:0 2025-12-04T13:48:37.2741300Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ConcaterMapDataPipe:0, line 29 <- wrt source file 2025-12-04T13:48:37.2741828Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ConcaterMapDataPipe:0 2025-12-04T13:48:37.2742430Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ZipperMapDataPipe:0, line 76 <- wrt source file 2025-12-04T13:48:37.2742953Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/combining.py::ZipperMapDataPipe:0 2025-12-04T13:48:37.2743481Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/grouping.py::BatcherMapDataPipe:0, line 29 <- wrt source file 2025-12-04T13:48:37.2744005Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/grouping.py::BatcherMapDataPipe:0 2025-12-04T13:48:37.2744532Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/utils.py::SequenceWrapperMapDataPipe:0, line 29 <- wrt source file 2025-12-04T13:48:37.2745074Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/map/utils.py::SequenceWrapperMapDataPipe:0 2025-12-04T13:48:37.2745590Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/common.py::validate_input_col:0, line 37 <- wrt source file 2025-12-04T13:48:37.2746107Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/common.py::validate_input_col:0 2025-12-04T13:48:37.2746609Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/decoder.py::basichandlers:0, line 47 <- wrt source file 2025-12-04T13:48:37.2747114Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/data/datapipes/utils/decoder.py::basichandlers:0 2025-12-04T13:48:37.2747602Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::find_closure_group:0, line 439 <- wrt source file 2025-12-04T13:48:37.7458137Z * SUCCESS: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::find_closure_group:0 2025-12-04T13:48:37.7458968Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::replace_extern_shared:0, line 535 <- wrt source file 2025-12-04T13:48:37.7459782Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/hipify/hipify_python.py::replace_extern_shared:0 2025-12-04T13:48:37.7460581Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.__init__:0, line 217 <- wrt source file 2025-12-04T13:48:37.7461383Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.__init__:0 2025-12-04T13:48:37.7462334Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_hparams:0, line 322 <- wrt source file 
2025-12-04T13:48:37.7463158Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_hparams:0 2025-12-04T13:48:37.7463961Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalar:0, line 370 <- wrt source file 2025-12-04T13:48:37.7464949Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalar:0 2025-12-04T13:48:37.7465749Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalars:0, line 402 <- wrt source file 2025-12-04T13:48:37.7466584Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_scalars:0 2025-12-04T13:48:37.7467403Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_tensor:0, line 450 <- wrt source file 2025-12-04T13:48:37.7468218Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_tensor:0 2025-12-04T13:48:37.7468897Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram:0, line 489 <- wrt source file 2025-12-04T13:48:37.7469560Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram:0 2025-12-04T13:48:37.7470218Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram_raw:0, line 542 <- wrt source file 2025-12-04T13:48:37.7470914Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_histogram_raw:0 2025-12-04T13:48:37.7471600Z * DOCTEST : 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_image:0, line 608 <- wrt source file 2025-12-04T13:48:37.7472304Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_image:0 2025-12-04T13:48:37.7472932Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_images:0, line 657 <- wrt source file 2025-12-04T13:48:37.7473586Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_images:0 2025-12-04T13:48:37.7474222Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_text:0, line 820 <- wrt source file 2025-12-04T13:48:37.7474882Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_text:0 2025-12-04T13:48:37.7475528Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_embedding:0, line 887 <- wrt source file 2025-12-04T13:48:37.7476193Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_embedding:0 2025-12-04T13:48:37.7476830Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_pr_curve:0, line 998 <- wrt source file 2025-12-04T13:48:37.7477546Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_pr_curve:0 2025-12-04T13:48:37.7478258Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_multilinechart:0, line 1072 <- wrt source file 2025-12-04T13:48:37.7479058Z * SKIPPED: 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_multilinechart:0 2025-12-04T13:48:37.7479655Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_marginchart:0, line 1093 <- wrt source file 2025-12-04T13:48:37.7480232Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars_marginchart:0 2025-12-04T13:48:37.7480784Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars:0, line 1118 <- wrt source file 2025-12-04T13:48:37.7481382Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_custom_scalars:0 2025-12-04T13:48:37.7481948Z * DOCTEST : /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_mesh:0, line 1164 <- wrt source file 2025-12-04T13:48:37.7482469Z * SKIPPED: /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/tensorboard/writer.py::SummaryWriter.add_mesh:0 2025-12-04T13:48:37.7482747Z ============ 2025-12-04T13:48:37.7482867Z Finished doctests 2025-12-04T13:48:37.7482964Z 378 / 894 passed 2025-12-04T13:48:37.7483064Z  2025-12-04T13:48:37.7483195Z === Found 17 parse-time warnings === 2025-12-04T13:48:37.7483368Z --- Parse Warning: 1 / 17 --- 2025-12-04T13:48:37.7483929Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=Library.fallback in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=368. 2025-12-04T13:48:37.7484355Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7484581Z Registers the function implementation as the fallback for the given key. 
2025-12-04T13:48:37.7484746Z 2025-12-04T13:48:37.7484880Z This function only works for a library with global namespace ("_"). 2025-12-04T13:48:37.7485035Z 2025-12-04T13:48:37.7485120Z Args: 2025-12-04T13:48:37.7485283Z fn: function used as fallback for the given dispatch key or :func:`~fallthrough_kernel` 2025-12-04T13:48:37.7485478Z to register a fallthrough. 2025-12-04T13:48:37.7485684Z dispatch_key: dispatch key that the input function should be registered for. By default, it uses 2025-12-04T13:48:37.7485908Z the dispatch key that the library was created with. 2025-12-04T13:48:37.7486154Z with_keyset: flag controlling if the current dispatcher call keyset should be passed as the first argument 2025-12-04T13:48:37.7486449Z to :attr:`fn` when calling. This should be used to create the appropriate keyset for redispatch calls. 2025-12-04T13:48:37.7486640Z 2025-12-04T13:48:37.7486728Z Example:: 2025-12-04T13:48:37.7486827Z 2025-12-04T13:48:37.7486924Z >>> my_lib = Library("_", "IMPL") 2025-12-04T13:48:37.7487069Z >>> def fallback_kernel(op, *args, **kwargs): 2025-12-04T13:48:37.7487220Z >>> # Handle all autocast ops generically 2025-12-04T13:48:37.7487352Z >>> # ... 
2025-12-04T13:48:37.7487481Z >>> my_lib.fallback(fallback_kernel, "Autocast") 2025-12-04T13:48:37.7487615Z 2025-12-04T13:48:37.7487891Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 5, 1, 'my_lib.fallback(fallback_kernel, "Autocast")\n', 5, 7)) 2025-12-04T13:48:37.7488176Z 2025-12-04T13:48:37.7488273Z my_lib.fallback(fallback_kernel, "Autocast") 2025-12-04T13:48:37.7488395Z ^ 2025-12-04T13:48:37.7488482Z warnings.warn(msg) 2025-12-04T13:48:37.7488580Z 2025-12-04T13:48:37.7488697Z --- Parse Warning: 2 / 17 --- 2025-12-04T13:48:37.7489084Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=958. 2025-12-04T13:48:37.7489481Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7489690Z Register a FakeTensor implementation ("fake impl") for this operator. 2025-12-04T13:48:37.7489842Z 2025-12-04T13:48:37.7489955Z Also sometimes known as a "meta kernel", "abstract impl". 2025-12-04T13:48:37.7490107Z 2025-12-04T13:48:37.7490237Z An "FakeTensor implementation" specifies the behavior of this operator on 2025-12-04T13:48:37.7490457Z Tensors that carry no data ("FakeTensor"). Given some input Tensors with 2025-12-04T13:48:37.7490658Z certain properties (sizes/strides/storage_offset/device), it specifies 2025-12-04T13:48:37.7490838Z what the properties of the output Tensors are. 2025-12-04T13:48:37.7490963Z 2025-12-04T13:48:37.7491141Z The FakeTensor implementation has the same signature as the operator. 2025-12-04T13:48:37.7491336Z It is run for both FakeTensors and meta tensors. 
To write a FakeTensor 2025-12-04T13:48:37.7491528Z implementation, assume that all Tensor inputs to the operator are 2025-12-04T13:48:37.7491719Z regular CPU/CUDA/Meta tensors, but they do not have storage, and 2025-12-04T13:48:37.7491954Z you are trying to return regular CPU/CUDA/Meta tensor(s) as output. 2025-12-04T13:48:37.7492149Z The FakeTensor implementation must consist of only PyTorch operations 2025-12-04T13:48:37.7492340Z (and may not directly access the storage or data of any input or 2025-12-04T13:48:37.7492488Z intermediate Tensors). 2025-12-04T13:48:37.7492594Z 2025-12-04T13:48:37.7492701Z This API may be used as a decorator (see examples). 2025-12-04T13:48:37.7492832Z 2025-12-04T13:48:37.7492932Z For a detailed guide on custom ops, please see 2025-12-04T13:48:37.7493136Z https://pytorch.org/tutorials/advanced/custom_ops_landing_page.html 2025-12-04T13:48:37.7493287Z 2025-12-04T13:48:37.7493365Z Args: 2025-12-04T13:48:37.7493500Z op_name: Operator name (along with the overload) or OpOverload object. 2025-12-04T13:48:37.7493667Z func: Fake tensor implementation. 2025-12-04T13:48:37.7493824Z lib (Optional[Library]): Library to register the fake tensor to. 2025-12-04T13:48:37.7494006Z allow_override: Flag controlling if we want to override an 2025-12-04T13:48:37.7494180Z existing registered fake impl. This is by default off, 2025-12-04T13:48:37.7494351Z and will error you're trying to register a fake impl to 2025-12-04T13:48:37.7494520Z an operator that already has a fake impl. This also only 2025-12-04T13:48:37.7494688Z applies if the custom operator was not created via 2025-12-04T13:48:37.7494859Z torch.library.custom_op, as overriding and existing fake 2025-12-04T13:48:37.7495016Z impl is already allowed. 
2025-12-04T13:48:37.7495131Z 2025-12-04T13:48:37.7495212Z Examples: 2025-12-04T13:48:37.7495310Z >>> import torch 2025-12-04T13:48:37.7495428Z >>> import numpy as np 2025-12-04T13:48:37.7495553Z >>> from torch import Tensor 2025-12-04T13:48:37.7495675Z >>> 2025-12-04T13:48:37.7495806Z >>> # Example 1: an operator without data-dependent output shape 2025-12-04T13:48:37.7495997Z >>> @torch.library.custom_op("mylib::custom_linear", mutates_args=()) 2025-12-04T13:48:37.7496198Z >>> def custom_linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-12-04T13:48:37.7496389Z >>> raise NotImplementedError("Implementation goes here") 2025-12-04T13:48:37.7496556Z >>> 2025-12-04T13:48:37.7496683Z >>> @torch.library.register_fake("mylib::custom_linear") 2025-12-04T13:48:37.7496833Z >>> def _(x, weight, bias): 2025-12-04T13:48:37.7496960Z >>> assert x.dim() == 2 2025-12-04T13:48:37.7497086Z >>> assert weight.dim() == 2 2025-12-04T13:48:37.7497213Z >>> assert bias.dim() == 1 2025-12-04T13:48:37.7497348Z >>> assert x.shape[1] == weight.shape[1] 2025-12-04T13:48:37.7497491Z >>> assert weight.shape[0] == bias.shape[0] 2025-12-04T13:48:37.7497652Z >>> assert x.device == weight.device 2025-12-04T13:48:37.7497773Z >>> 2025-12-04T13:48:37.7497890Z >>> return (x @ weight.t()) + bias 2025-12-04T13:48:37.7498009Z >>> 2025-12-04T13:48:37.7498133Z >>> with torch._subclasses.fake_tensor.FakeTensorMode(): 2025-12-04T13:48:37.7498284Z >>> x = torch.randn(2, 3) 2025-12-04T13:48:37.7498409Z >>> w = torch.randn(3, 3) 2025-12-04T13:48:37.7498557Z >>> b = torch.randn(3) 2025-12-04T13:48:37.7498698Z >>> y = torch.ops.mylib.custom_linear(x, w, b) 2025-12-04T13:48:37.7498829Z >>> 2025-12-04T13:48:37.7498930Z >>> assert y.shape == (2, 3) 2025-12-04T13:48:37.7499045Z >>> 2025-12-04T13:48:37.7499167Z >>> # Example 2: an operator with data-dependent output shape 2025-12-04T13:48:37.7499355Z >>> @torch.library.custom_op("mylib::custom_nonzero", mutates_args=()) 2025-12-04T13:48:37.7499529Z >>> def 
custom_nonzero(x: Tensor) -> Tensor: 2025-12-04T13:48:37.7499666Z >>> x_np = x.numpy(force=True) 2025-12-04T13:48:37.7499803Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-12-04T13:48:37.7499950Z >>> return torch.tensor(res, device=x.device) 2025-12-04T13:48:37.7500077Z >>> 2025-12-04T13:48:37.7500199Z >>> @torch.library.register_fake("mylib::custom_nonzero") 2025-12-04T13:48:37.7500344Z >>> def _(x): 2025-12-04T13:48:37.7500472Z >>> # Number of nonzero-elements is data-dependent. 2025-12-04T13:48:37.7500635Z >>> # Since we cannot peek at the data in an fake impl, 2025-12-04T13:48:37.7500796Z >>> # we use the ctx object to construct a new symint that 2025-12-04T13:48:37.7500948Z >>> # represents the data-dependent size. 2025-12-04T13:48:37.7501087Z >>> ctx = torch.library.get_ctx() 2025-12-04T13:48:37.7501225Z >>> nnz = ctx.new_dynamic_size() 2025-12-04T13:48:37.7501356Z >>> shape = [nnz, x.dim()] 2025-12-04T13:48:37.7501503Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-12-04T13:48:37.7501645Z >>> return result 2025-12-04T13:48:37.7501755Z >>> 2025-12-04T13:48:37.7501932Z >>> from torch.fx.experimental.proxy_tensor import make_fx 2025-12-04T13:48:37.7502077Z >>> 2025-12-04T13:48:37.7502183Z >>> x = torch.tensor([0, 1, 2, 3, 4, 0]) 2025-12-04T13:48:37.7502365Z >>> trace = make_fx(torch.ops.mylib.custom_nonzero, tracing_mode="symbolic")(x) 2025-12-04T13:48:37.7502543Z >>> trace.print_readable() 2025-12-04T13:48:37.7502662Z >>> 2025-12-04T13:48:37.7502797Z >>> assert torch.allclose(trace(x), torch.ops.mylib.custom_nonzero(x)) 2025-12-04T13:48:37.7502950Z 2025-12-04T13:48:37.7503033Z 2025-12-04T13:48:37.7503264Z Original Error: IndentationError('expected an indented block after function definition on line 37', ('', 38, 1, '_._ = None\n', 38, 2)) 2025-12-04T13:48:37.7503513Z 2025-12-04T13:48:37.7503594Z _._ = None 2025-12-04T13:48:37.7503686Z ^ 2025-12-04T13:48:37.7503775Z warnings.warn(msg) 2025-12-04T13:48:37.7503876Z 2025-12-04T13:48:37.7503996Z --- 
Parse Warning: 3 / 17 --- 2025-12-04T13:48:37.7504360Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=get_kernel in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py line=1530. 2025-12-04T13:48:37.7504783Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7504990Z Returns the computed kernel for a given operator and dispatch key. 2025-12-04T13:48:37.7505138Z 2025-12-04T13:48:37.7505267Z This function retrieves the kernel that would be executed for a given 2025-12-04T13:48:37.7505489Z operator and dispatch key combination. The returned SafeKernelFunction 2025-12-04T13:48:37.7505699Z can be used to call the kernel in a boxed fashion. The intended use 2025-12-04T13:48:37.7505887Z case for this function is to retrieve the original kernel for a given 2025-12-04T13:48:37.7506086Z dispatch key and then register another kernel to the same dispatch key 2025-12-04T13:48:37.7506276Z that calls into the original kernel for certain cases. 2025-12-04T13:48:37.7506413Z 2025-12-04T13:48:37.7506513Z Args: 2025-12-04T13:48:37.7506644Z op: Operator name (along with the overload) or OpOverload object 2025-12-04T13:48:37.7506844Z Can be a string (e.g., "aten::add.Tensor"), an OpOverload, or a CustomOpDef. 2025-12-04T13:48:37.7507065Z dispatch_key (str | torch.DispatchKey): The dispatch key to get the kernel for. 2025-12-04T13:48:37.7507272Z Can be a string (e.g., "CPU", "CUDA") or a DispatchKey enum value. 2025-12-04T13:48:37.7507416Z 2025-12-04T13:48:37.7507498Z Returns: 2025-12-04T13:48:37.7507644Z torch._C._SafeKernelFunction: A safe kernel function that can be used to 2025-12-04T13:48:37.7507814Z call the kernel. 2025-12-04T13:48:37.7507921Z 2025-12-04T13:48:37.7508003Z Raises: 2025-12-04T13:48:37.7508116Z RuntimeError: If the operator does not exist. 
2025-12-04T13:48:37.7508245Z 2025-12-04T13:48:37.7508324Z Example: 2025-12-04T13:48:37.7508430Z >>> # Get the CPU kernel for torch.add 2025-12-04T13:48:37.7508588Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", "CPU") 2025-12-04T13:48:37.7508729Z >>> 2025-12-04T13:48:37.7508832Z >>> # You can also use DispatchKey enum 2025-12-04T13:48:37.7509012Z >>> kernel = torch.library.get_kernel("aten::add.Tensor", torch.DispatchKey.CPU) 2025-12-04T13:48:37.7509176Z >>> 2025-12-04T13:48:37.7509275Z >>> # Or use an OpOverload directly 2025-12-04T13:48:37.7509442Z >>> kernel = torch.library.get_kernel(torch.ops.aten.add.Tensor, "CPU") 2025-12-04T13:48:37.7509595Z >>> 2025-12-04T13:48:37.7509727Z >>> # Example: Using get_kernel in a custom op with conditional dispatch 2025-12-04T13:48:37.7509897Z >>> # Get the original kernel for torch.sin 2025-12-04T13:48:37.7510064Z >>> original_sin_kernel = torch.library.get_kernel("aten::sin", "CPU") 2025-12-04T13:48:37.7510215Z >>> 2025-12-04T13:48:37.7510350Z >>> # If input has negative values, use original sin, otherwise return zeros 2025-12-04T13:48:37.7510526Z >>> def conditional_sin_impl(dispatch_keys, x): 2025-12-04T13:48:37.7510659Z >>> if (x < 0).any(): 2025-12-04T13:48:37.7510802Z >>> return original_sin_kernel.call_boxed(dispatch_keys, x) 2025-12-04T13:48:37.7510945Z >>> else: 2025-12-04T13:48:37.7511057Z >>> return torch.zeros_like(x) 2025-12-04T13:48:37.7511173Z >>> 2025-12-04T13:48:37.7511280Z >>> lib = torch.library.Library("aten", "IMPL") 2025-12-04T13:48:37.7511465Z >>> # with_keyset=True so the first argument to the impl is the current DispatchKeySet 2025-12-04T13:48:37.7511666Z >>> which needs to be the first argument to ``kernel.call_boxed`` 2025-12-04T13:48:37.7511899Z >>> lib.impl("sin", conditional_sin_impl, "CPU", with_keyset=True) 2025-12-04T13:48:37.7512042Z >>> 2025-12-04T13:48:37.7512139Z >>> # Test the conditional behavior 2025-12-04T13:48:37.7512272Z >>> x_positive = torch.tensor([1.0, 2.0]) 
2025-12-04T13:48:37.7512405Z >>> x_mixed = torch.tensor([-1.0, 2.0]) 2025-12-04T13:48:37.7512532Z >>> torch.sin(x_positive) 2025-12-04T13:48:37.7512649Z tensor([0., 0.]) 2025-12-04T13:48:37.7512759Z >>> torch.sin(x_mixed) 2025-12-04T13:48:37.7512890Z tensor([-0.8415, 0.9093]) 2025-12-04T13:48:37.7512998Z 2025-12-04T13:48:37.7513231Z Original Error: SyntaxError('invalid syntax', ('', 23, 7, 'which needs to be the first argument to ``kernel.call_boxed``\n', 23, 12)) 2025-12-04T13:48:37.7513462Z 2025-12-04T13:48:37.7513576Z which needs to be the first argument to ``kernel.call_boxed`` 2025-12-04T13:48:37.7513714Z ^ 2025-12-04T13:48:37.7513807Z warnings.warn(msg) 2025-12-04T13:48:37.7513904Z 2025-12-04T13:48:37.7514037Z --- Parse Warning: 4 / 17 --- 2025-12-04T13:48:37.7514417Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=is_available in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=70. 2025-12-04T13:48:37.7514859Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7515078Z Check if the current accelerator is available at runtime: it was build, all the 2025-12-04T13:48:37.7515285Z required drivers are available and at least one device is visible. 2025-12-04T13:48:37.7515460Z See :ref:`accelerator` for details. 2025-12-04T13:48:37.7515585Z 2025-12-04T13:48:37.7515663Z Returns: 2025-12-04T13:48:37.7515818Z bool: A boolean indicating if there is an available :ref:`accelerator`. 2025-12-04T13:48:37.7515988Z 2025-12-04T13:48:37.7516124Z .. note:: This API delegates to the device-specific version of `is_available`. 2025-12-04T13:48:37.7516344Z On CUDA, when the environment variable ``PYTORCH_NVML_BASED_CUDA_CHECK=1`` is set, 2025-12-04T13:48:37.7516565Z this function will NOT poison fork. Otherwise, it will. For more details, see 2025-12-04T13:48:37.7516752Z :ref:`multiprocessing-poison-fork-note`. 
2025-12-04T13:48:37.7516877Z 2025-12-04T13:48:37.7516958Z Example:: 2025-12-04T13:48:37.7517045Z 2025-12-04T13:48:37.7517184Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 2025-12-04T13:48:37.7517349Z 2025-12-04T13:48:37.7517583Z Original Error: SyntaxError('invalid syntax', ('', 1, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 1, 78)) 2025-12-04T13:48:37.7517836Z 2025-12-04T13:48:37.7517969Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-12-04T13:48:37.7518142Z ^ 2025-12-04T13:48:37.7518259Z warnings.warn(msg) 2025-12-04T13:48:37.7518355Z 2025-12-04T13:48:37.7518468Z --- Parse Warning: 5 / 17 --- 2025-12-04T13:48:37.7518847Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=synchronize in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/accelerator/__init__.py line=239. 2025-12-04T13:48:37.7519262Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7519464Z Wait for all kernels in all streams on the given device to complete. 2025-12-04T13:48:37.7519609Z 2025-12-04T13:48:37.7519686Z Args: 2025-12-04T13:48:37.7519851Z device (:class:`torch.device`, str, int, optional): device for which to synchronize. It must match 2025-12-04T13:48:37.7520107Z the current :ref:`accelerator` device type. If not given, 2025-12-04T13:48:37.7520302Z use :func:`torch.accelerator.current_device_index` by default. 2025-12-04T13:48:37.7520441Z 2025-12-04T13:48:37.7520592Z .. note:: This function is a no-op if the current :ref:`accelerator` is not initialized. 2025-12-04T13:48:37.7520769Z 2025-12-04T13:48:37.7520848Z Example:: 2025-12-04T13:48:37.7520934Z 2025-12-04T13:48:37.7521033Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA) 2025-12-04T13:48:37.7521230Z >>> assert torch.accelerator.is_available() "No available accelerators detected." 
2025-12-04T13:48:37.7521428Z >>> start_event = torch.Event(enable_timing=True) 2025-12-04T13:48:37.7521575Z >>> end_event = torch.Event(enable_timing=True) 2025-12-04T13:48:37.7521705Z >>> start_event.record() 2025-12-04T13:48:37.7521911Z >>> tensor = torch.randn(100, device=torch.accelerator.current_accelerator()) 2025-12-04T13:48:37.7522104Z >>> sum = torch.sum(tensor) 2025-12-04T13:48:37.7522232Z >>> end_event.record() 2025-12-04T13:48:37.7522365Z >>> torch.accelerator.synchronize() 2025-12-04T13:48:37.7522519Z >>> elapsed_time_ms = start_event.elapsed_time(end_event) 2025-12-04T13:48:37.7522658Z 2025-12-04T13:48:37.7522896Z Original Error: SyntaxError('invalid syntax', ('', 2, 41, 'assert torch.accelerator.is_available() "No available accelerators detected."\n', 2, 78)) 2025-12-04T13:48:37.7523154Z 2025-12-04T13:48:37.7523295Z assert torch.accelerator.is_available() "No available accelerators detected." 2025-12-04T13:48:37.7523468Z ^ 2025-12-04T13:48:37.7523587Z warnings.warn(msg) 2025-12-04T13:48:37.7523685Z 2025-12-04T13:48:37.7523803Z --- Parse Warning: 6 / 17 --- 2025-12-04T13:48:37.7524172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=cudart in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py line=448. 2025-12-04T13:48:37.7524578Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7524755Z Retrieves the CUDA runtime API module. 2025-12-04T13:48:37.7524871Z 2025-12-04T13:48:37.7524950Z 2025-12-04T13:48:37.7525089Z This function initializes the CUDA runtime environment if it is not already 2025-12-04T13:48:37.7525305Z initialized and returns the CUDA runtime API module (_cudart). The CUDA 2025-12-04T13:48:37.7525508Z runtime API module provides access to various CUDA runtime functions. 
2025-12-04T13:48:37.7525664Z 2025-12-04T13:48:37.7525745Z Args: 2025-12-04T13:48:37.7525835Z ``None`` 2025-12-04T13:48:37.7525930Z 2025-12-04T13:48:37.7526011Z Returns: 2025-12-04T13:48:37.7526130Z module: The CUDA runtime API module (_cudart). 2025-12-04T13:48:37.7526259Z 2025-12-04T13:48:37.7526341Z Raises: 2025-12-04T13:48:37.7526484Z RuntimeError: If CUDA cannot be re-initialized in a forked subprocess. 2025-12-04T13:48:37.7526734Z AssertionError: If PyTorch is not compiled with CUDA support or if libcudart functions are unavailable. 2025-12-04T13:48:37.7526931Z 2025-12-04T13:48:37.7527033Z Example of CUDA operations with profiling: 2025-12-04T13:48:37.7527164Z >>> import torch 2025-12-04T13:48:37.7527290Z >>> from torch.cuda import cudart, check_error 2025-12-04T13:48:37.7527422Z >>> import os 2025-12-04T13:48:37.7527524Z >>> 2025-12-04T13:48:37.7527627Z >>> os.environ["CUDA_PROFILE"] = "1" 2025-12-04T13:48:37.7527747Z >>> 2025-12-04T13:48:37.7527858Z >>> def perform_cuda_operations_with_streams(): 2025-12-04T13:48:37.7527998Z >>> stream = torch.cuda.Stream() 2025-12-04T13:48:37.7528158Z >>> with torch.cuda.stream(stream): 2025-12-04T13:48:37.7528298Z >>> x = torch.randn(100, 100, device='cuda') 2025-12-04T13:48:37.7528442Z >>> y = torch.randn(100, 100, device='cuda') 2025-12-04T13:48:37.7528580Z >>> z = torch.mul(x, y) 2025-12-04T13:48:37.7528702Z >>> return z 2025-12-04T13:48:37.7528806Z >>> 2025-12-04T13:48:37.7528911Z >>> torch.cuda.synchronize() 2025-12-04T13:48:37.7529069Z >>> print("====== Start nsys profiling ======") 2025-12-04T13:48:37.7529217Z >>> check_error(cudart().cudaProfilerStart()) 2025-12-04T13:48:37.7529386Z >>> with torch.autograd.profiler.emit_nvtx(): 2025-12-04T13:48:37.7529544Z >>> result = perform_cuda_operations_with_streams() 2025-12-04T13:48:37.7529697Z >>> print("CUDA operations completed.") 2025-12-04T13:48:37.7529853Z >>> check_error(torch.cuda.cudart().cudaProfilerStop()) 2025-12-04T13:48:37.7530023Z >>> print("====== End nsys 
profiling ======") 2025-12-04T13:48:37.7530144Z 2025-12-04T13:48:37.7530269Z To run this example and save the profiling information, execute: 2025-12-04T13:48:37.7530520Z >>> $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-12-04T13:48:37.7530719Z 2025-12-04T13:48:37.7530856Z This command profiles the CUDA operations in the provided script and saves 2025-12-04T13:48:37.7531064Z the profiling information to a file named `trace_name.prof`. 2025-12-04T13:48:37.7531262Z The `--profile-from-start off` option ensures that profiling starts only 2025-12-04T13:48:37.7531447Z after the `cudaProfilerStart` call in the script. 2025-12-04T13:48:37.7531625Z The `--csv` and `--print-summary` options format the profiling output as a 2025-12-04T13:48:37.7531800Z CSV file and print a summary, respectively. 2025-12-04T13:48:37.7532037Z The `-o` option specifies the output file name, and the `-f` option forces the 2025-12-04T13:48:37.7532232Z overwrite of the output file if it already exists. 2025-12-04T13:48:37.7532365Z 2025-12-04T13:48:37.7532642Z Original Error: SyntaxError('invalid syntax', ('', 1, 1, '$ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py\n', 1, 2)) 2025-12-04T13:48:37.7532935Z 2025-12-04T13:48:37.7533109Z $ nvprof --profile-from-start off --csv --print-summary -o trace_name.prof -f -- python cudart_test.py 2025-12-04T13:48:37.7533304Z ^ 2025-12-04T13:48:37.7533395Z warnings.warn(msg) 2025-12-04T13:48:37.7533499Z 2025-12-04T13:48:37.7533619Z --- Parse Warning: 7 / 17 --- 2025-12-04T13:48:37.7533984Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=vmap in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=39. 
2025-12-04T13:48:37.7534386Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7534547Z 2025-12-04T13:48:37.7534675Z vmap is the vectorizing map; ``vmap(func)`` returns a new function that 2025-12-04T13:48:37.7534869Z maps ``func`` over some dimension of the inputs. Semantically, vmap 2025-12-04T13:48:37.7535061Z pushes the map into PyTorch operations called by ``func``, effectively 2025-12-04T13:48:37.7535230Z vectorizing those operations. 2025-12-04T13:48:37.7535344Z 2025-12-04T13:48:37.7535469Z vmap is useful for handling batch dimensions: one can write a function 2025-12-04T13:48:37.7535664Z ``func`` that runs on examples and then lift it to a function that can 2025-12-04T13:48:37.7535855Z take batches of examples with ``vmap(func)``. vmap can also be used to 2025-12-04T13:48:37.7536041Z compute batched gradients when composed with autograd. 2025-12-04T13:48:37.7536204Z 2025-12-04T13:48:37.7536288Z .. note:: 2025-12-04T13:48:37.7536417Z :func:`torch.vmap` is aliased to :func:`torch.func.vmap` for 2025-12-04T13:48:37.7536577Z convenience. Use whichever one you'd like. 2025-12-04T13:48:37.7536699Z 2025-12-04T13:48:37.7536781Z Args: 2025-12-04T13:48:37.7536912Z func (function): A Python function that takes one or more arguments. 2025-12-04T13:48:37.7537078Z Must return one or more Tensors. 2025-12-04T13:48:37.7537236Z in_dims (int or nested structure): Specifies which dimension of the 2025-12-04T13:48:37.7537433Z inputs should be mapped over. ``in_dims`` should have a 2025-12-04T13:48:37.7537622Z structure like the inputs. If the ``in_dim`` for a particular 2025-12-04T13:48:37.7537808Z input is None, then that indicates there is no map dimension. 2025-12-04T13:48:37.7537955Z Default: 0. 2025-12-04T13:48:37.7538095Z out_dims (int or Tuple[int]): Specifies where the mapped dimension 2025-12-04T13:48:37.7538296Z should appear in the outputs. 
If ``out_dims`` is a Tuple, then 2025-12-04T13:48:37.7538471Z it should have one element per output. Default: 0. 2025-12-04T13:48:37.7538644Z randomness (str): Specifies whether the randomness in this 2025-12-04T13:48:37.7538830Z vmap should be the same or different across batches. If 'different', 2025-12-04T13:48:37.7539021Z the randomness for each batch will be different. If 'same', the 2025-12-04T13:48:37.7539215Z randomness will be the same across batches. If 'error', any calls to 2025-12-04T13:48:37.7539410Z random functions will error. Default: 'error'. WARNING: this flag 2025-12-04T13:48:37.7539603Z only applies to random PyTorch operations and does not apply to 2025-12-04T13:48:37.7539773Z Python's random module or numpy randomness. 2025-12-04T13:48:37.7539951Z chunk_size (None or int): If None (default), apply a single vmap over inputs. 2025-12-04T13:48:37.7540159Z If not None, then compute the vmap :attr:`chunk_size` samples at a time. 2025-12-04T13:48:37.7540376Z Note that :attr:`chunk_size=1` is equivalent to computing the vmap with a for-loop. 2025-12-04T13:48:37.7540609Z If you run into memory issues computing the vmap, please try a non-None chunk_size. 2025-12-04T13:48:37.7540779Z 2025-12-04T13:48:37.7540863Z Returns: 2025-12-04T13:48:37.7540991Z Returns a new "batched" function. It takes the same inputs as 2025-12-04T13:48:37.7541175Z ``func``, except each input has an extra dimension at the index 2025-12-04T13:48:37.7541358Z specified by ``in_dims``. It takes returns the same outputs as 2025-12-04T13:48:37.7541539Z ``func``, except each output has an extra dimension at the index 2025-12-04T13:48:37.7541694Z specified by ``out_dims``. 2025-12-04T13:48:37.7541805Z 2025-12-04T13:48:37.7541915Z .. warning: 2025-12-04T13:48:37.7542051Z :func:`vmap` works best with functional-style code. Please do not 2025-12-04T13:48:37.7542239Z perform any side-effects in ``func``, with the exception of 2025-12-04T13:48:37.7542436Z in-place PyTorch operations. 
Examples of side-effects include mutating 2025-12-04T13:48:37.7542644Z Python data structures and assigning values to variables not captured 2025-12-04T13:48:37.7542801Z in ``func``. 2025-12-04T13:48:37.7542899Z 2025-12-04T13:48:37.7543038Z One example of using :func:`vmap` is to compute batched dot products. PyTorch 2025-12-04T13:48:37.7543245Z doesn't provide a batched ``torch.dot`` API; instead of unsuccessfully 2025-12-04T13:48:37.7543450Z rummaging through docs, use :func:`vmap` to construct a new function. 2025-12-04T13:48:37.7543603Z 2025-12-04T13:48:37.7543692Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:37.7543850Z >>> batched_dot = torch.func.vmap(torch.dot) # [N, D], [N, D] -> [N] 2025-12-04T13:48:37.7544042Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-12-04T13:48:37.7544177Z >>> batched_dot(x, y) 2025-12-04T13:48:37.7544282Z 2025-12-04T13:48:37.7544414Z :func:`vmap` can be helpful in hiding batch dimensions, leading to a simpler 2025-12-04T13:48:37.7544582Z model authoring experience. 2025-12-04T13:48:37.7544691Z 2025-12-04T13:48:37.7544789Z >>> batch_size, feature_size = 3, 5 2025-12-04T13:48:37.7544941Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-12-04T13:48:37.7545096Z >>> 2025-12-04T13:48:37.7545195Z >>> def model(feature_vec): 2025-12-04T13:48:37.7545340Z >>> # Very simple linear model with activation 2025-12-04T13:48:37.7545484Z >>> return feature_vec.dot(weights).relu() 2025-12-04T13:48:37.7545607Z >>> 2025-12-04T13:48:37.7545718Z >>> examples = torch.randn(batch_size, feature_size) 2025-12-04T13:48:37.7545865Z >>> result = torch.vmap(model)(examples) 2025-12-04T13:48:37.7545987Z 2025-12-04T13:48:37.7546143Z :func:`vmap` can also help vectorize computations that were previously difficult 2025-12-04T13:48:37.7546359Z or impossible to batch. One example is higher-order gradient computation. 2025-12-04T13:48:37.7546567Z The PyTorch autograd engine computes vjps (vector-Jacobian products). 
2025-12-04T13:48:37.7546775Z Computing a full Jacobian matrix for some function f: R^N -> R^N usually 2025-12-04T13:48:37.7546991Z requires N calls to ``autograd.grad``, one per Jacobian row. Using :func:`vmap`, 2025-12-04T13:48:37.7547207Z we can vectorize the whole computation, computing the Jacobian in a single 2025-12-04T13:48:37.7547376Z call to ``autograd.grad``. 2025-12-04T13:48:37.7547483Z 2025-12-04T13:48:37.7547573Z >>> # Setup 2025-12-04T13:48:37.7547667Z >>> N = 5 2025-12-04T13:48:37.7547768Z >>> f = lambda x: x**2 2025-12-04T13:48:37.7547889Z >>> x = torch.randn(N, requires_grad=True) 2025-12-04T13:48:37.7548015Z >>> y = f(x) 2025-12-04T13:48:37.7548119Z >>> I_N = torch.eye(N) 2025-12-04T13:48:37.7548224Z >>> 2025-12-04T13:48:37.7548319Z >>> # Sequential approach 2025-12-04T13:48:37.7548481Z >>> jacobian_rows = [torch.autograd.grad(y, x, v, retain_graph=True)[0] 2025-12-04T13:48:37.7548647Z >>> for v in I_N.unbind()] 2025-12-04T13:48:37.7548782Z >>> jacobian = torch.stack(jacobian_rows) 2025-12-04T13:48:37.7548904Z >>> 2025-12-04T13:48:37.7549006Z >>> # vectorized gradient computation 2025-12-04T13:48:37.7549135Z >>> def get_vjp(v): 2025-12-04T13:48:37.7549254Z >>> return torch.autograd.grad(y, x, v) 2025-12-04T13:48:37.7549391Z >>> jacobian = torch.vmap(get_vjp)(I_N) 2025-12-04T13:48:37.7549510Z 2025-12-04T13:48:37.7549653Z :func:`vmap` can also be nested, producing an output with multiple batched dimensions 2025-12-04T13:48:37.7549820Z 2025-12-04T13:48:37.7549911Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:37.7550038Z >>> batched_dot = torch.vmap( 2025-12-04T13:48:37.7550162Z ... torch.vmap(torch.dot) 2025-12-04T13:48:37.7550292Z ... 
) # [N1, N0, D], [N1, N0, D] -> [N1, N0] 2025-12-04T13:48:37.7550435Z >>> x, y = torch.randn(2, 3, 5), torch.randn(2, 3, 5) 2025-12-04T13:48:37.7550581Z >>> batched_dot(x, y) # tensor of size [2, 3] 2025-12-04T13:48:37.7550702Z 2025-12-04T13:48:37.7550837Z If the inputs are not batched along the first dimension, ``in_dims`` specifies 2025-12-04T13:48:37.7551025Z the dimension that each inputs are batched along as 2025-12-04T13:48:37.7551155Z 2025-12-04T13:48:37.7551244Z >>> torch.dot # [N], [N] -> [] 2025-12-04T13:48:37.7551406Z >>> batched_dot = torch.vmap(torch.dot, in_dims=1) # [N, D], [N, D] -> [D] 2025-12-04T13:48:37.7551579Z >>> x, y = torch.randn(2, 5), torch.randn(2, 5) 2025-12-04T13:48:37.7551705Z >>> batched_dot( 2025-12-04T13:48:37.7551828Z ... x, y 2025-12-04T13:48:37.7551997Z ... ) # output is [5] instead of [2] if batched along the 0th dimension 2025-12-04T13:48:37.7552141Z 2025-12-04T13:48:37.7552283Z If there are multiple inputs each of which is batched along different dimensions, 2025-12-04T13:48:37.7552493Z ``in_dims`` must be a tuple with the batch dimension for each input as 2025-12-04T13:48:37.7552641Z 2025-12-04T13:48:37.7552731Z >>> torch.dot # [D], [D] -> [] 2025-12-04T13:48:37.7552926Z >>> batched_dot = torch.vmap(torch.dot, in_dims=(0, None)) # [N, D], [D] -> [N] 2025-12-04T13:48:37.7553121Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-12-04T13:48:37.7553249Z >>> batched_dot( 2025-12-04T13:48:37.7553352Z ... x, y 2025-12-04T13:48:37.7553486Z ... 
) # second arg doesn't have a batch dim because in_dim[1] was None 2025-12-04T13:48:37.7553631Z 2025-12-04T13:48:37.7553765Z If the input is a Python struct, ``in_dims`` must be a tuple containing a struct 2025-12-04T13:48:37.7553953Z matching the shape of the input: 2025-12-04T13:48:37.7554069Z 2025-12-04T13:48:37.7554174Z >>> f = lambda dict: torch.dot(dict["x"], dict["y"]) 2025-12-04T13:48:37.7554318Z >>> x, y = torch.randn(2, 5), torch.randn(5) 2025-12-04T13:48:37.7554446Z >>> input = {"x": x, "y": y} 2025-12-04T13:48:37.7554591Z >>> batched_dot = torch.vmap(f, in_dims=({"x": 0, "y": None},)) 2025-12-04T13:48:37.7554739Z >>> batched_dot(input) 2025-12-04T13:48:37.7554845Z 2025-12-04T13:48:37.7554989Z By default, the output is batched along the first dimension. However, it can be batched 2025-12-04T13:48:37.7555184Z along any dimension by using ``out_dims`` 2025-12-04T13:48:37.7555305Z 2025-12-04T13:48:37.7555392Z >>> f = lambda x: x**2 2025-12-04T13:48:37.7555506Z >>> x = torch.randn(2, 5) 2025-12-04T13:48:37.7555632Z >>> batched_pow = torch.vmap(f, out_dims=1) 2025-12-04T13:48:37.7555765Z >>> batched_pow(x) # [5, 2] 2025-12-04T13:48:37.7555877Z 2025-12-04T13:48:37.7556030Z For any function that uses kwargs, the returned function will not batch the kwargs but will 2025-12-04T13:48:37.7556209Z accept kwargs 2025-12-04T13:48:37.7556301Z 2025-12-04T13:48:37.7556389Z >>> x = torch.randn([2, 5]) 2025-12-04T13:48:37.7556508Z >>> def fn(x, scale=4.): 2025-12-04T13:48:37.7556625Z >>> return x * scale 2025-12-04T13:48:37.7556731Z >>> 2025-12-04T13:48:37.7556829Z >>> batched_pow = torch.vmap(fn) 2025-12-04T13:48:37.7556968Z >>> assert torch.allclose(batched_pow(x), x * 4) 2025-12-04T13:48:37.7557150Z >>> batched_pow(x, scale=x) # scale is not batched, output has shape [2, 2, 5] 2025-12-04T13:48:37.7557309Z 2025-12-04T13:48:37.7557392Z .. 
note:: 2025-12-04T13:48:37.7557530Z vmap does not provide general autobatching or handle variable-length 2025-12-04T13:48:37.7557695Z sequences out of the box. 2025-12-04T13:48:37.7557806Z 2025-12-04T13:48:37.7558034Z Original Error: IndentationError('expected an indented block after function definition on line 4', ('', 5, 1, '_._ = None\n', 5, 2)) 2025-12-04T13:48:37.7558279Z 2025-12-04T13:48:37.7558361Z _._ = None 2025-12-04T13:48:37.7558452Z ^ 2025-12-04T13:48:37.7558541Z warnings.warn(msg) 2025-12-04T13:48:37.7558641Z 2025-12-04T13:48:37.7558761Z --- Parse Warning: 8 / 17 --- 2025-12-04T13:48:37.7559133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=grad in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_functorch/apis.py line=306. 2025-12-04T13:48:37.7559537Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7559754Z ``grad`` operator helps computing gradients of ``func`` with respect to the 2025-12-04T13:48:37.7559975Z input(s) specified by ``argnums``. This operator can be nested to 2025-12-04T13:48:37.7560135Z compute higher-order gradients. 2025-12-04T13:48:37.7560255Z 2025-12-04T13:48:37.7560337Z Args: 2025-12-04T13:48:37.7560470Z func (Callable): A Python function that takes one or more arguments. 2025-12-04T13:48:37.7560685Z Must return a single-element Tensor. If specified ``has_aux`` equals ``True``, 2025-12-04T13:48:37.7560917Z function can return a tuple of single-element Tensor and other auxiliary objects: 2025-12-04T13:48:37.7561117Z ``(output, aux)``. 2025-12-04T13:48:37.7561306Z argnums (int or Tuple[int]): Specifies arguments to compute gradients with respect to. 2025-12-04T13:48:37.7561527Z ``argnums`` can be single integer or tuple of integers. Default: 0. 
2025-12-04T13:48:37.7561728Z has_aux (bool): Flag indicating that ``func`` returns a tensor and other 2025-12-04T13:48:37.7561955Z auxiliary objects: ``(output, aux)``. Default: False. 2025-12-04T13:48:37.7562106Z 2025-12-04T13:48:37.7562190Z Returns: 2025-12-04T13:48:37.7562351Z Function to compute gradients with respect to its inputs. By default, the output of 2025-12-04T13:48:37.7562579Z the function is the gradient tensor(s) with respect to the first argument. 2025-12-04T13:48:37.7562804Z If specified ``has_aux`` equals ``True``, tuple of gradients and output auxiliary objects 2025-12-04T13:48:37.7563032Z is returned. If ``argnums`` is a tuple of integers, a tuple of output gradients with 2025-12-04T13:48:37.7563221Z respect to each ``argnums`` value is returned. 2025-12-04T13:48:37.7563350Z 2025-12-04T13:48:37.7563437Z Example of using ``grad``: 2025-12-04T13:48:37.7563547Z 2025-12-04T13:48:37.7563635Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7563758Z >>> from torch.func import grad 2025-12-04T13:48:37.7563886Z >>> x = torch.randn([]) 2025-12-04T13:48:37.7564021Z >>> cos_x = grad(lambda x: torch.sin(x))(x) 2025-12-04T13:48:37.7564164Z >>> assert torch.allclose(cos_x, x.cos()) 2025-12-04T13:48:37.7564287Z >>> 2025-12-04T13:48:37.7564390Z >>> # Second-order gradients 2025-12-04T13:48:37.7564531Z >>> neg_sin_x = grad(grad(lambda x: torch.sin(x)))(x) 2025-12-04T13:48:37.7564687Z >>> assert torch.allclose(neg_sin_x, -x.sin()) 2025-12-04T13:48:37.7564814Z 2025-12-04T13:48:37.7564955Z When composed with ``vmap``, ``grad`` can be used to compute per-sample-gradients: 2025-12-04T13:48:37.7565120Z 2025-12-04T13:48:37.7565210Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7565337Z >>> from torch.func import grad, vmap 2025-12-04T13:48:37.7565472Z >>> batch_size, feature_size = 3, 5 2025-12-04T13:48:37.7565592Z >>> 2025-12-04T13:48:37.7565694Z >>> def model(weights, feature_vec): 2025-12-04T13:48:37.7565835Z >>> # Very simple linear model with activation 
2025-12-04T13:48:37.7565974Z >>> assert feature_vec.dim() == 1 2025-12-04T13:48:37.7566110Z >>> return feature_vec.dot(weights).relu() 2025-12-04T13:48:37.7566234Z >>> 2025-12-04T13:48:37.7566347Z >>> def compute_loss(weights, example, target): 2025-12-04T13:48:37.7566488Z >>> y = model(weights, example) 2025-12-04T13:48:37.7566628Z >>> return ((y - target) ** 2).mean() # MSELoss 2025-12-04T13:48:37.7566756Z >>> 2025-12-04T13:48:37.7566879Z >>> weights = torch.randn(feature_size, requires_grad=True) 2025-12-04T13:48:37.7567044Z >>> examples = torch.randn(batch_size, feature_size) 2025-12-04T13:48:37.7567188Z >>> targets = torch.randn(batch_size) 2025-12-04T13:48:37.7567324Z >>> inputs = (weights, examples, targets) 2025-12-04T13:48:37.7567500Z >>> grad_weight_per_example = vmap(grad(compute_loss), in_dims=(None, 0, 0))( 2025-12-04T13:48:37.7567691Z ... *inputs 2025-12-04T13:48:37.7567797Z ... ) 2025-12-04T13:48:37.7567887Z 2025-12-04T13:48:37.7568005Z Example of using ``grad`` with ``has_aux`` and ``argnums``: 2025-12-04T13:48:37.7568145Z 2025-12-04T13:48:37.7568234Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7568355Z >>> from torch.func import grad 2025-12-04T13:48:37.7568482Z >>> def my_loss_func(y, y_pred): 2025-12-04T13:48:37.7568635Z >>> loss_per_sample = (0.5 * y_pred - y) ** 2 2025-12-04T13:48:37.7568790Z >>> loss = loss_per_sample.mean() 2025-12-04T13:48:37.7568929Z >>> return loss, (y_pred, loss_per_sample) 2025-12-04T13:48:37.7569052Z >>> 2025-12-04T13:48:37.7569172Z >>> fn = grad(my_loss_func, argnums=(0, 1), has_aux=True) 2025-12-04T13:48:37.7569312Z >>> y_true = torch.rand(4) 2025-12-04T13:48:37.7569462Z >>> y_preds = torch.rand(4, requires_grad=True) 2025-12-04T13:48:37.7569600Z >>> out = fn(y_true, y_preds) 2025-12-04T13:48:37.7569774Z >>> # > output is ((grads w.r.t y_true, grads w.r.t y_preds), (y_pred, loss_per_sample)) 2025-12-04T13:48:37.7569934Z 2025-12-04T13:48:37.7570017Z .. 
note:: 2025-12-04T13:48:37.7570141Z Using PyTorch ``torch.no_grad`` together with ``grad``. 2025-12-04T13:48:37.7570279Z 2025-12-04T13:48:37.7570391Z Case 1: Using ``torch.no_grad`` inside a function: 2025-12-04T13:48:37.7570521Z 2025-12-04T13:48:37.7570612Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7570731Z >>> def f(x): 2025-12-04T13:48:37.7570851Z >>> with torch.no_grad(): 2025-12-04T13:48:37.7570974Z >>> c = x ** 2 2025-12-04T13:48:37.7571092Z >>> return x - c 2025-12-04T13:48:37.7571201Z 2025-12-04T13:48:37.7571332Z In this case, ``grad(f)(x)`` will respect the inner ``torch.no_grad``. 2025-12-04T13:48:37.7571479Z 2025-12-04T13:48:37.7571595Z Case 2: Using ``grad`` inside ``torch.no_grad`` context manager: 2025-12-04T13:48:37.7571738Z 2025-12-04T13:48:37.7571827Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7572013Z >>> with torch.no_grad(): 2025-12-04T13:48:37.7572135Z >>> grad(f)(x) 2025-12-04T13:48:37.7572243Z 2025-12-04T13:48:37.7572371Z In this case, ``grad`` will respect the inner ``torch.no_grad``, but not the 2025-12-04T13:48:37.7572579Z outer one. This is because ``grad`` is a "function transform": its result 2025-12-04T13:48:37.7572784Z should not depend on the result of a context manager outside of ``f``. 2025-12-04T13:48:37.7572939Z 2025-12-04T13:48:37.7573019Z 2025-12-04T13:48:37.7573246Z Original Error: IndentationError('expected an indented block after function definition on line 5', ('', 6, 1, '_._ = None\n', 6, 2)) 2025-12-04T13:48:37.7573496Z 2025-12-04T13:48:37.7579324Z _._ = None 2025-12-04T13:48:37.7579426Z ^ 2025-12-04T13:48:37.7579524Z warnings.warn(msg) 2025-12-04T13:48:37.7579634Z 2025-12-04T13:48:37.7579769Z --- Parse Warning: 9 / 17 --- 2025-12-04T13:48:37.7580188Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=CustomOpDef.register_fake in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/custom_ops.py line=402. 
2025-12-04T13:48:37.7580635Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7580839Z Register a FakeTensor implementation for this custom op. 2025-12-04T13:48:37.7580982Z 2025-12-04T13:48:37.7581124Z This is necessary to get the operator to work efficiently with torch.compile. 2025-12-04T13:48:37.7581296Z 2025-12-04T13:48:37.7581475Z The Fake impl (sometimes also known as a meta kernel or abstract impl) 2025-12-04T13:48:37.7581681Z specifies the behavior of this operator on Tensors that carry no data. 2025-12-04T13:48:37.7581910Z Given some input Tensors with certain properties 2025-12-04T13:48:37.7582109Z (sizes/strides/storage_offset/device), it specifies what the properties of 2025-12-04T13:48:37.7582292Z the output Tensors are. 2025-12-04T13:48:37.7582408Z 2025-12-04T13:48:37.7582559Z Please see :func:`torch.library.register_fake` for more details. 2025-12-04T13:48:37.7582710Z 2025-12-04T13:48:37.7582798Z Args: 2025-12-04T13:48:37.7582944Z fn (Callable): The function to register as the FakeTensor 2025-12-04T13:48:37.7583096Z implementation. 
2025-12-04T13:48:37.7583213Z 2025-12-04T13:48:37.7583302Z Examples: 2025-12-04T13:48:37.7583411Z >>> import torch 2025-12-04T13:48:37.7583536Z >>> import numpy as np 2025-12-04T13:48:37.7583687Z >>> from torch import Tensor 2025-12-04T13:48:37.7583809Z >>> 2025-12-04T13:48:37.7583945Z >>> # Example 1: an operator without data-dependent output shape 2025-12-04T13:48:37.7584133Z >>> @torch.library.custom_op("mylib::linear", mutates_args=()) 2025-12-04T13:48:37.7584316Z >>> def linear(x: Tensor, weight: Tensor, bias: Tensor) -> Tensor: 2025-12-04T13:48:37.7584484Z >>> return (x @ weight.t()) + bias 2025-12-04T13:48:37.7584611Z >>> 2025-12-04T13:48:37.7584714Z >>> @linear.register_fake 2025-12-04T13:48:37.7584844Z >>> def _(x, weight, bias): 2025-12-04T13:48:37.7584967Z >>> assert x.dim() == 2 2025-12-04T13:48:37.7585096Z >>> assert weight.dim() == 2 2025-12-04T13:48:37.7585231Z >>> assert bias.dim() == 1 2025-12-04T13:48:37.7585490Z >>> assert x.shape[1] == weight.shape[1] 2025-12-04T13:48:37.7585640Z >>> assert weight.shape[0] == bias.shape[0] 2025-12-04T13:48:37.7585785Z >>> assert x.device == weight.device 2025-12-04T13:48:37.7585931Z >>> return x.new_empty(x.size(0), weight.size(0)) 2025-12-04T13:48:37.7586063Z >>> 2025-12-04T13:48:37.7586165Z >>> x = torch.randn(2, 2) 2025-12-04T13:48:37.7586297Z >>> weight = torch.randn(2, 2) 2025-12-04T13:48:37.7586428Z >>> bias = torch.randn(2) 2025-12-04T13:48:37.7586568Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:37.7586741Z >>> out = torch.compile(linear, fullgraph=True)(x, weight, bias) 2025-12-04T13:48:37.7586913Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:37.7587105Z >>> assert torch.allclose(out, torch.nn.functional.linear(x, weight, bias)) 2025-12-04T13:48:37.7587274Z >>> 2025-12-04T13:48:37.7587403Z >>> # Example 2: an operator with data-dependent output shape 2025-12-04T13:48:37.7587587Z >>> @torch.library.custom_op("mylib::nonzero", mutates_args=()) 
2025-12-04T13:48:37.7587759Z >>> def nonzero(x: Tensor) -> Tensor: 2025-12-04T13:48:37.7587896Z >>> x_np = x.cpu().numpy() 2025-12-04T13:48:37.7588037Z >>> res = np.stack(np.nonzero(x_np), axis=1) 2025-12-04T13:48:37.7588190Z >>> return torch.tensor(res, device=x.device) 2025-12-04T13:48:37.7588323Z >>> 2025-12-04T13:48:37.7588430Z >>> @nonzero.register_fake 2025-12-04T13:48:37.7588557Z >>> def _(x): 2025-12-04T13:48:37.7588687Z >>> # Number of nonzero-elements is data-dependent. 2025-12-04T13:48:37.7588852Z >>> # Since we cannot peek at the data in an abstract impl, 2025-12-04T13:48:37.7589042Z >>> # we use the ctx object to construct a new symint that 2025-12-04T13:48:37.7589197Z >>> # represents the data-dependent size. 2025-12-04T13:48:37.7589342Z >>> ctx = torch.library.get_ctx() 2025-12-04T13:48:37.7589483Z >>> nnz = ctx.new_dynamic_size() 2025-12-04T13:48:37.7589620Z >>> shape = [nnz, x.dim()] 2025-12-04T13:48:37.7589769Z >>> result = x.new_empty(shape, dtype=torch.int64) 2025-12-04T13:48:37.7589924Z >>> return result 2025-12-04T13:48:37.7590035Z >>> 2025-12-04T13:48:37.7590140Z >>> x = torch.tensor([0, 1, 2, 0, 0, 1]) 2025-12-04T13:48:37.7590295Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:37.7590452Z >>> out = torch.compile(nonzero, fullgraph=True)(x) 2025-12-04T13:48:37.7590607Z >>> # xdoctest: +SKIP("Requires Python <= 3.11") 2025-12-04T13:48:37.7590758Z >>> assert torch.allclose(out, x.nonzero()) 2025-12-04T13:48:37.7590898Z 2025-12-04T13:48:37.7590986Z 2025-12-04T13:48:37.7591225Z Original Error: IndentationError('expected an indented block after function definition on line 36', ('', 37, 1, '_._ = None\n', 37, 2)) 2025-12-04T13:48:37.7591476Z 2025-12-04T13:48:37.7591563Z _._ = None 2025-12-04T13:48:37.7591653Z ^ 2025-12-04T13:48:37.7591744Z warnings.warn(msg) 2025-12-04T13:48:37.7591886Z 2025-12-04T13:48:37.7592015Z --- Parse Warning: 10 / 17 --- 2025-12-04T13:48:37.7592431Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=unsafe_generate_fake_kernels in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_library/fake_profile.py line=94. 2025-12-04T13:48:37.7592883Z Caused by: DoctestParseError('Failed to parse doctest in _label_docsrc_lines') 2025-12-04T13:48:37.7593053Z 2025-12-04T13:48:37.7593188Z Registers a fake kernel based on the given operator profiles. This fake 2025-12-04T13:48:37.7593400Z kernel registration will override any existing fake kernel registrations. 2025-12-04T13:48:37.7593558Z 2025-12-04T13:48:37.7593681Z The input is a dictionary mapping operator names to a set of operator 2025-12-04T13:48:37.7593881Z profiles, which we will use to generate fake kernels. The operator profiles 2025-12-04T13:48:37.7594078Z are a record of the input and output tensor metadata. Based on this 2025-12-04T13:48:37.7594275Z information we will match a given input to the recorded profile, and return 2025-12-04T13:48:37.7594479Z an output with the same metadata as in the recorded profile. If a profile 2025-12-04T13:48:37.7594651Z doesn't exist then an exception will be thrown. 2025-12-04T13:48:37.7594775Z 2025-12-04T13:48:37.7594900Z The fake kernel generation is considered unsafe because it relies on the 2025-12-04T13:48:37.7595106Z rigid, pre-defined operator profiles that do not account for potential 2025-12-04T13:48:37.7595315Z variations in output behavior. Specifically, the generated kernels assume a 2025-12-04T13:48:37.7595530Z fixed relationship between input and output ranks. However, in reality, it's 2025-12-04T13:48:37.7595746Z possible that data-dependent operations may produce outputs of different 2025-12-04T13:48:37.7595947Z ranks even when given inputs of the same rank. The generated fake kernels 2025-12-04T13:48:37.7596145Z are inflexible and unable to accommodate these nuances, making them 2025-12-04T13:48:37.7596300Z potentially unsafe. 
2025-12-04T13:48:37.7596399Z 2025-12-04T13:48:37.7596479Z Args: 2025-12-04T13:48:37.7596607Z op_profiles (dict[str, set[OpProfile]]): A dictionary mapping operator 2025-12-04T13:48:37.7596800Z name to a set of operator profiles from which we will generate fake 2025-12-04T13:48:37.7596948Z kernels. 2025-12-04T13:48:37.7597069Z 2025-12-04T13:48:37.7597151Z Examples: 2025-12-04T13:48:37.7597236Z 2025-12-04T13:48:37.7597348Z >>> # Example: Registering an op-profile from draft-export 2025-12-04T13:48:37.7597488Z >>> import torch 2025-12-04T13:48:37.7597619Z >>> from torch.export._draft_export import draft_export 2025-12-04T13:48:37.7597753Z >>> 2025-12-04T13:48:37.7597872Z >>> @torch.library.custom_op("mylib::foo", mutates_args=()) 2025-12-04T13:48:37.7598029Z >>> def foo(x: Tensor, y: Tensor) -> Tensor: 2025-12-04T13:48:37.7598172Z >>> return x + y 2025-12-04T13:48:37.7598274Z >>> 2025-12-04T13:48:37.7598382Z >>> class M(torch.nn.Module): 2025-12-04T13:48:37.7598507Z >>> def forward(self, a, b): 2025-12-04T13:48:37.7598645Z >>> res = torch.ops.mylib.foo(a, b) # no fake impl 2025-12-04T13:48:37.7598780Z >>> return res 2025-12-04T13:48:37.7598881Z >>> 2025-12-04T13:48:37.7598997Z >>> ep = draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)) 2025-12-04T13:48:37.7599145Z >>> 2025-12-04T13:48:37.7599300Z >>> with torch._library.fake_profile.unsafe_generate_fake_kernels(ep._report.op_profiles): 2025-12-04T13:48:37.7599493Z >>> decomp = ep.run_decompositions() 2025-12-04T13:48:37.7599608Z 2025-12-04T13:48:37.7599684Z 2025-12-04T13:48:37.7599894Z Original Error: IncompleteParseError('ill-formed doctest: all parts have been processed but the doctest source is not balanced') 2025-12-04T13:48:37.7600125Z 2025-12-04T13:48:37.7600208Z warnings.warn(msg) 2025-12-04T13:48:37.7600305Z 2025-12-04T13:48:37.7600421Z --- Parse Warning: 11 / 17 --- 2025-12-04T13:48:37.7600896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape 
callname=ActivationSparsifier in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/ao/pruning/_experimental/activation_sparsifier/activation_sparsifier.py line=16. 2025-12-04T13:48:37.7601403Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7601562Z 2025-12-04T13:48:37.7601700Z The Activation sparsifier class aims to sparsify/prune activations in a neural 2025-12-04T13:48:37.7601953Z network. The idea is to attach the sparsifier to a layer (or layers) and it 2025-12-04T13:48:37.7602163Z zeroes out the activations based on the mask_fn (or sparsification function) 2025-12-04T13:48:37.7602326Z input by the user. 2025-12-04T13:48:37.7602470Z The mask_fn is applied once all the inputs are aggregated and reduced i.e. 2025-12-04T13:48:37.7602650Z mask = mask_fn(reduce_fn(aggregate_fn(activations))) 2025-12-04T13:48:37.7602777Z 2025-12-04T13:48:37.7602856Z Note:: 2025-12-04T13:48:37.7603018Z The sparsification mask is computed on the input **before it goes through the attached layer**. 2025-12-04T13:48:37.7603200Z 2025-12-04T13:48:37.7603278Z Args: 2025-12-04T13:48:37.7603367Z model (nn.Module): 2025-12-04T13:48:37.7603516Z The model whose layers will be sparsified. The layers that needs to be 2025-12-04T13:48:37.7603722Z sparsified should be added separately using the register_layer() function 2025-12-04T13:48:37.7603895Z aggregate_fn (Optional, Callable): 2025-12-04T13:48:37.7604068Z default aggregate_fn that is used if not specified while registering the layer. 2025-12-04T13:48:37.7604261Z specifies how inputs should be aggregated over time. 2025-12-04T13:48:37.7604463Z The aggregate_fn should usually take 2 torch tensors and return the aggregated tensor. 
2025-12-04T13:48:37.7604636Z Example 2025-12-04T13:48:37.7604762Z def add_agg_fn(tensor1, tensor2): return tensor1 + tensor2 2025-12-04T13:48:37.7604913Z reduce_fn (Optional, Callable): 2025-12-04T13:48:37.7605091Z default reduce_fn that is used if not specified while registering the layer. 2025-12-04T13:48:37.7605340Z reduce_fn will be called on the aggregated tensor i.e. the tensor obtained after 2025-12-04T13:48:37.7605517Z calling agg_fn() on all inputs. 2025-12-04T13:48:37.7605640Z Example 2025-12-04T13:48:37.7605780Z def mean_reduce_fn(agg_tensor): return agg_tensor.mean(dim=0) 2025-12-04T13:48:37.7605940Z mask_fn (Optional, Callable): 2025-12-04T13:48:37.7606127Z default mask_fn that is used to create the sparsification mask using the tensor obtained after 2025-12-04T13:48:37.7606394Z calling the reduce_fn(). This is used by default if a custom one is passed in the 2025-12-04T13:48:37.7606565Z register_layer(). 2025-12-04T13:48:37.7606770Z Note that the mask_fn() definition should contain the sparse arguments that is passed in sparse_config 2025-12-04T13:48:37.7606969Z arguments. 2025-12-04T13:48:37.7607102Z features (Optional, list): 2025-12-04T13:48:37.7607237Z default selected features to sparsify. 2025-12-04T13:48:37.7607423Z If this is non-empty, then the mask_fn will be applied for each feature of the input. 2025-12-04T13:48:37.7607597Z For example, 2025-12-04T13:48:37.7607759Z mask = [mask_fn(reduce_fn(aggregated_fn(input[feature])) for feature in features] 2025-12-04T13:48:37.7607934Z feature_dim (Optional, int): 2025-12-04T13:48:37.7608114Z default dimension of input features. Again, features along this dim will be chosen 2025-12-04T13:48:37.7608289Z for sparsification. 2025-12-04T13:48:37.7608407Z sparse_config (Dict): 2025-12-04T13:48:37.7608562Z Default configuration for the mask_fn. 
This config will be passed 2025-12-04T13:48:37.7608720Z with the mask_fn() 2025-12-04T13:48:37.7608830Z 2025-12-04T13:48:37.7608908Z Example: 2025-12-04T13:48:37.7609001Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7609110Z >>> model = SomeModel() 2025-12-04T13:48:37.7609266Z >>> act_sparsifier = ActivationSparsifier(...) # init activation sparsifier 2025-12-04T13:48:37.7609433Z >>> # Initialize aggregate_fn 2025-12-04T13:48:37.7609546Z >>> def agg_fn(x, y): 2025-12-04T13:48:37.7609651Z >>> return x + y 2025-12-04T13:48:37.7609751Z >>> 2025-12-04T13:48:37.7609842Z >>> # Initialize reduce_fn 2025-12-04T13:48:37.7609954Z >>> def reduce_fn(x): 2025-12-04T13:48:37.7610067Z >>> return torch.mean(x, dim=0) 2025-12-04T13:48:37.7610179Z >>> 2025-12-04T13:48:37.7610267Z >>> # Initialize mask_fn 2025-12-04T13:48:37.7610376Z >>> def mask_fn(data): 2025-12-04T13:48:37.7610503Z >>> return torch.eye(data.shape).to(data.device) 2025-12-04T13:48:37.7610629Z >>> 2025-12-04T13:48:37.7610710Z >>> 2025-12-04T13:48:37.7610806Z >>> act_sparsifier.register_layer( 2025-12-04T13:48:37.7610927Z ... model.some_layer, 2025-12-04T13:48:37.7611043Z ... aggregate_fn=agg_fn, 2025-12-04T13:48:37.7611162Z ... reduce_fn=reduce_fn, 2025-12-04T13:48:37.7611275Z ... mask_fn=mask_fn, 2025-12-04T13:48:37.7611377Z ... 
) 2025-12-04T13:48:37.7611462Z >>> 2025-12-04T13:48:37.7611550Z >>> # start training process 2025-12-04T13:48:37.7611665Z >>> for _ in [...]: 2025-12-04T13:48:37.7611772Z >>> # epoch starts 2025-12-04T13:48:37.7611944Z >>> # model.forward(), compute_loss() and model.backwards() 2025-12-04T13:48:37.7612082Z >>> # epoch ends 2025-12-04T13:48:37.7612188Z >>> act_sparsifier.step() 2025-12-04T13:48:37.7612306Z >>> # end training process 2025-12-04T13:48:37.7612420Z >>> sparsifier.squash_mask() 2025-12-04T13:48:37.7612549Z 2025-12-04T13:48:37.7612763Z Original Error: IndentationError("expected an indented block after 'for' statement on line 25", ('', 26, 1, '_._ = None\n', 26, 2)) 2025-12-04T13:48:37.7613000Z 2025-12-04T13:48:37.7613077Z _._ = None 2025-12-04T13:48:37.7613163Z ^ 2025-12-04T13:48:37.7613245Z warnings.warn(msg) 2025-12-04T13:48:37.7613341Z 2025-12-04T13:48:37.7613456Z --- Parse Warning: 12 / 17 --- 2025-12-04T13:48:37.7613877Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=DeviceMesh.__getitem__ in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/device_mesh.py line=547. 2025-12-04T13:48:37.7614326Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7614487Z 2025-12-04T13:48:37.7614628Z Slice the current DeviceMesh based on the mesh_dim_names given to create a submesh. 2025-12-04T13:48:37.7614873Z The submesh created consists of the dimensions and the communicators indicated by 2025-12-04T13:48:37.7615044Z ``mesh_dim_names`` 2025-12-04T13:48:37.7615142Z 2025-12-04T13:48:37.7615221Z Args: 2025-12-04T13:48:37.7615358Z mesh_dim_names (Union[str, Tuple[str]]): the name or the tuple of names of the 2025-12-04T13:48:37.7615555Z mesh dimension of the DeviceMesh to create the submesh for. 
2025-12-04T13:48:37.7615695Z Returns: 2025-12-04T13:48:37.7615787Z A :class:`DeviceMesh` object 2025-12-04T13:48:37.7615896Z 2025-12-04T13:48:37.7616043Z The following program runs on each process/rank in an SPMD manner in a world size of 8. 2025-12-04T13:48:37.7616222Z In the first example: 2025-12-04T13:48:37.7616383Z Calling mesh_2d["tp"] on rank 0, 1, 2, 3 returns a 1D submesh of DeviceMesh:([0, 1, 2, 3]). 2025-12-04T13:48:37.7616596Z Calling mesh_2d["tp"] on rank 4, 5, 6, 7 returns a 1D submesh of DeviceMesh:([4, 5, 6, 7]). 2025-12-04T13:48:37.7616807Z Calling mesh_2d["dp"] on rank 0, 4 returns a 1D submesh of DeviceMesh:([0, 4]). 2025-12-04T13:48:37.7617003Z Calling mesh_2d["dp"] on rank 1, 5 returns a 1D submesh of DeviceMesh:([1, 5]). 2025-12-04T13:48:37.7617198Z Calling mesh_2d["dp"] on rank 2, 6 returns a 1D submesh of DeviceMesh:([2, 6]). 2025-12-04T13:48:37.7617392Z Calling mesh_2d["dp"] on rank 3, 7 returns a 1D submesh of DeviceMesh:([3, 7]). 2025-12-04T13:48:37.7617538Z 2025-12-04T13:48:37.7617624Z In the second example: 2025-12-04T13:48:37.7617788Z Calling mesh_3d["dp", "cp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 1], [4, 5]]). 2025-12-04T13:48:37.7618011Z Calling mesh_3d["dp", "cp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 3], [6, 7]]). 2025-12-04T13:48:37.7618230Z Calling mesh_3d["cp", "dp"] on rank 0, 1, 4, 5 returns a 2D submesh of DeviceMesh:([[0, 4], [1, 5]]). 2025-12-04T13:48:37.7618448Z Calling mesh_3d["cp", "dp"] on rank 2, 3, 6, 7 returns a 2D submesh of DeviceMesh:([[2, 6], [3, 7]]). 
2025-12-04T13:48:37.7618609Z 2025-12-04T13:48:37.7618690Z Example:: 2025-12-04T13:48:37.7618774Z 2025-12-04T13:48:37.7618861Z >>> # xdoctest: +SKIP("no rank") 2025-12-04T13:48:37.7619009Z >>> from torch.distributed.device_mesh import DeviceMesh 2025-12-04T13:48:37.7619145Z >>> 2025-12-04T13:48:37.7619267Z >>> # Initialize a 2D device mesh as (2, 4) to represent the topology 2025-12-04T13:48:37.7619434Z >>> # of cross-host(dim 0), and within-host (dim 1). 2025-12-04T13:48:37.7619621Z >>> mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-12-04T13:48:37.7619795Z >>> tp_mesh = mesh_2d["tp"] 2025-12-04T13:48:37.7619913Z >>> dp_mesh = mesh_2d["dp"] 2025-12-04T13:48:37.7620020Z >>> 2025-12-04T13:48:37.7620106Z >>> # Initialize a 3D mesh. 2025-12-04T13:48:37.7620279Z >>> mesh_3d = init_device_mesh(device_type="cuda", (2,2,2), mesh_dim_names=("dp", "pp", "cp")) 2025-12-04T13:48:37.7620570Z >>> # The order of the mesh_dim_names provided deteremines the order of dimensions in the submesh. 2025-12-04T13:48:37.7620759Z >>> dp_cp_mesh = mesh_3d["dp", "cp"] 2025-12-04T13:48:37.7620884Z >>> cp_dp_mesh = mesh_3d["cp", "dp"] 2025-12-04T13:48:37.7620995Z 2025-12-04T13:48:37.7621268Z Original Error: SyntaxError('positional argument follows keyword argument', ('', 6, 82, 'mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp"))\n', 6, 83)) 2025-12-04T13:48:37.7621574Z 2025-12-04T13:48:37.7621717Z mesh_2d = init_device_mesh(device_type="cuda", (2,4), mesh_dim_names=("dp", "tp")) 2025-12-04T13:48:37.7621938Z ^ 2025-12-04T13:48:37.7622064Z warnings.warn(msg) 2025-12-04T13:48:37.7622161Z 2025-12-04T13:48:37.7622274Z --- Parse Warning: 13 / 17 --- 2025-12-04T13:48:37.7622717Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=SavePlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=122. 
2025-12-04T13:48:37.7623153Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7623311Z 2025-12-04T13:48:37.7623455Z Abstract class defining the protocol used by save_state_dict to plan the save process. 2025-12-04T13:48:37.7623627Z 2025-12-04T13:48:37.7623775Z SavePlanners are stateful objects that can be used to customize the whole save process. 2025-12-04T13:48:37.7623949Z 2025-12-04T13:48:37.7624089Z SavePlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-12-04T13:48:37.7624268Z will be visible to the whole process. 2025-12-04T13:48:37.7624384Z 2025-12-04T13:48:37.7624523Z A planner subclass can expect the following sequence of calls during save_state_dict: 2025-12-04T13:48:37.7624689Z 2025-12-04T13:48:37.7624783Z 1) set_up_planner - called on all ranks. 2025-12-04T13:48:37.7624918Z Signals the start of a checkpoint save. 2025-12-04T13:48:37.7625034Z 2025-12-04T13:48:37.7625128Z 2) create_local_plan - called on all ranks. 2025-12-04T13:48:37.7625316Z Process the state_dict and produces a `SavePlan` that will be sent for global planning. 2025-12-04T13:48:37.7625487Z 2025-12-04T13:48:37.7625599Z 3) create_global_plan - called on the coordinator rank only. 2025-12-04T13:48:37.7625781Z Takes the SavePlan from all ranks and make any global decision. 2025-12-04T13:48:37.7625924Z 2025-12-04T13:48:37.7626012Z 4) finish_plan - called on all ranks. 2025-12-04T13:48:37.7626171Z This gives each rank a chance to adjust to global planning decisions. 2025-12-04T13:48:37.7626317Z 2025-12-04T13:48:37.7626419Z 5) resolve_data - called multiple times on each rank 2025-12-04T13:48:37.7626591Z Lookups a value on the `state_dict` for the storage layer to write. 2025-12-04T13:48:37.7626734Z 2025-12-04T13:48:37.7626884Z Users are recommended to extend DefaultSavePlanner instead of this interface directly as 2025-12-04T13:48:37.7627098Z most changes can be expressed by changes in a single method. 
2025-12-04T13:48:37.7627233Z 2025-12-04T13:48:37.7627323Z There are 3 usual patterns of extension: 2025-12-04T13:48:37.7627437Z 2025-12-04T13:48:37.7627570Z Rewriting state_dict. This is the simplest way to extend the save process as it 2025-12-04T13:48:37.7627781Z doesn't requite understanding the intrincacies of how SavePlan works: 2025-12-04T13:48:37.7627931Z 2025-12-04T13:48:37.7628019Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7628154Z >>> class RenamePlanner(DefaultSavePlanner): 2025-12-04T13:48:37.7628284Z >>> def set_up_planner( 2025-12-04T13:48:37.7628389Z >>> self, 2025-12-04T13:48:37.7628519Z >>> state_dict: STATE_DICT_TYPE, 2025-12-04T13:48:37.7628653Z >>> storage_meta: Optional[StorageMeta], 2025-12-04T13:48:37.7628781Z >>> is_coordinator: bool, 2025-12-04T13:48:37.7628892Z >>> ) -> None: 2025-12-04T13:48:37.7628998Z >>> # prefix all keys with `foo_`` 2025-12-04T13:48:37.7629181Z >>> super().set_up_planner({"foo_" + k: v for k, v in state_dict.items()}, storage_meta, is_coordinator) 2025-12-04T13:48:37.7629351Z 2025-12-04T13:48:37.7629524Z Modifying local plan and lookup in tandem. 
This is useful when fine control of how data is persisted 2025-12-04T13:48:37.7629709Z 2025-12-04T13:48:37.7629813Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7629946Z >>> class FP16Planner(DefaultSavePlanner): 2025-12-04T13:48:37.7630074Z >>> def create_local_plan(self): 2025-12-04T13:48:37.7630125Z >>> plan = super().create_local_plan() 2025-12-04T13:48:37.7630168Z >>> for p in plan: 2025-12-04T13:48:37.7630232Z >>> if p.tensor_data is not None: 2025-12-04T13:48:37.7630297Z >>> p.tensor_data.properties.dtype = torch.float16 2025-12-04T13:48:37.7630337Z >>> return plan 2025-12-04T13:48:37.7630372Z >>> 2025-12-04T13:48:37.7630421Z >>> def resolve_data(self, write_item): 2025-12-04T13:48:37.7630477Z >>> item = super().resolve_data(write_item) 2025-12-04T13:48:37.7630579Z >>> return item if write_item.type == WriteItemType.BYTE_IO else item.to(torch.float16) 2025-12-04T13:48:37.7630613Z 2025-12-04T13:48:37.7630733Z Using the global planning step to make central decisions that can't be made individually by each rank 2025-12-04T13:48:37.7630767Z 2025-12-04T13:48:37.7630814Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7630862Z >>> from itertools import zip_longest 2025-12-04T13:48:37.7630907Z >>> from dataclasses import replace 2025-12-04T13:48:37.7630974Z >>> class DDPLoadBalancingPlanner(DefaultSavePlanner): 2025-12-04T13:48:37.7631076Z >>> # This uses the default local plan behavior of having all non-sharded writes in rank 0 2025-12-04T13:48:37.7631131Z >>> # This sample doesn't handle ShardedTensors 2025-12-04T13:48:37.7631186Z >>> def create_global_plan(self, all_plans): 2025-12-04T13:48:37.7631248Z >>> iters = [iter(all_plans[0].items)] * len(all_plans) 2025-12-04T13:48:37.7631290Z >>> items_per_rank = [ 2025-12-04T13:48:37.7631348Z >>> [item for item in items if item is not None] 2025-12-04T13:48:37.7631408Z >>> for items in zip(*zip_longest(*iters), strict=True) 2025-12-04T13:48:37.7631445Z >>> ] 2025-12-04T13:48:37.7631485Z >>> 
all_plans = [ 2025-12-04T13:48:37.7631534Z >>> replace(plan, items=items) 2025-12-04T13:48:37.7631607Z >>> for plan, items in zip(all_plans, items_per_rank, strict=True) 2025-12-04T13:48:37.7631645Z >>> ] 2025-12-04T13:48:37.7631703Z >>> return super().create_global_plan(all_plans) 2025-12-04T13:48:37.7631740Z 2025-12-04T13:48:37.7631838Z Finally, some planners need to save additional metadata in the checkpoint, this is 2025-12-04T13:48:37.7631977Z accomplished by having each rank contribute their data items in the local plan and 2025-12-04T13:48:37.7632025Z the global planner aggregate them: 2025-12-04T13:48:37.7632062Z 2025-12-04T13:48:37.7632109Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7632176Z >>> class SaveExtraDataPlanner(DefaultSavePlanner): 2025-12-04T13:48:37.7632231Z >>> def create_local_plan(self) -> SavePlan: 2025-12-04T13:48:37.7632284Z >>> plan = super().create_local_plan() 2025-12-04T13:48:37.7632352Z >>> return replace(plan, planner_data="per-rank-data") 2025-12-04T13:48:37.7632389Z >>> 2025-12-04T13:48:37.7632498Z >>> def create_global_plan(self, all_plans: List[SavePlan]) -> Tuple[List[SavePlan], Metadata]: 2025-12-04T13:48:37.7632597Z >>> global_plan, metadata = super().create_global_plan(all_plans) 2025-12-04T13:48:37.7632660Z >>> merged_data = [p.planner_data for p in global_plan] 2025-12-04T13:48:37.7632731Z >>> metadata = replace(metadata, planner_data=merged_data) 2025-12-04T13:48:37.7632778Z >>> return global_plan, metadata 2025-12-04T13:48:37.7632814Z 2025-12-04T13:48:37.7632994Z Original Error: IndentationError('expected an indented block after function definition on line 3', ('', 9, 0, '_._ = None\n', 9, -1)) 2025-12-04T13:48:37.7633045Z 2025-12-04T13:48:37.7633081Z _._ = None 2025-12-04T13:48:37.7633131Z ^ 2025-12-04T13:48:37.7633171Z warnings.warn(msg) 2025-12-04T13:48:37.7633208Z 2025-12-04T13:48:37.7633283Z --- Parse Warning: 14 / 17 --- 2025-12-04T13:48:37.7633618Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=LoadPlanner in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/checkpoint/planner.py line=305. 2025-12-04T13:48:37.7633710Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7633746Z 2025-12-04T13:48:37.7633849Z Abstract class defining the protocol used by load_state_dict to plan the load process. 2025-12-04T13:48:37.7633885Z 2025-12-04T13:48:37.7633987Z LoadPlanner are stateful objects that can be used to customize the whole load process. 2025-12-04T13:48:37.7634024Z 2025-12-04T13:48:37.7634125Z LoadPlanner acts as an access proxy to the state_dict, so any transformation done to it 2025-12-04T13:48:37.7634175Z will be visible to the whole process. 2025-12-04T13:48:37.7634210Z 2025-12-04T13:48:37.7634313Z A planner subclass can expect the following sequence of calls during load_state_dict: 2025-12-04T13:48:37.7634346Z 2025-12-04T13:48:37.7634398Z 1) set_up_planner - called on all ranks. 2025-12-04T13:48:37.7634453Z Signals the start of loading a checkpoint. 2025-12-04T13:48:37.7634489Z 2025-12-04T13:48:37.7634541Z 2) create_local_plan - called on all ranks. 2025-12-04T13:48:37.7634647Z Process the state_dict and produces a `LoadPlan` that will be sent for global planning. 2025-12-04T13:48:37.7634680Z 2025-12-04T13:48:37.7634753Z 3) create_global_plan - called on the coordinator rank only. 2025-12-04T13:48:37.7634829Z Takes the LoadPlan from all ranks and make any global decision. 2025-12-04T13:48:37.7634863Z 2025-12-04T13:48:37.7634925Z 4) load_bytes - called multiple times on each rank 2025-12-04T13:48:37.7634995Z This is called once per non-tensor value in state_dict. 
2025-12-04T13:48:37.7635032Z 2025-12-04T13:48:37.7635114Z 5) resolve_tensor and commit_tensor - called multiple times on each rank 2025-12-04T13:48:37.7635186Z They are called in pair for each Tensor value in state_dict. 2025-12-04T13:48:37.7635222Z 2025-12-04T13:48:37.7635333Z Users are recommended to extend DefaultLoadPlanner instead of this interface directly as 2025-12-04T13:48:37.7635403Z most changes can be expressed by changes in a single method. 2025-12-04T13:48:37.7635439Z 2025-12-04T13:48:37.7635493Z There are two usual patterns of extension: 2025-12-04T13:48:37.7635529Z 2025-12-04T13:48:37.7635623Z Rewriting state_dict. This is the simplest way to extend the load process as it 2025-12-04T13:48:37.7635720Z doesn't requite understanding the intrincacies of how LoadPlan works. We need 2025-12-04T13:48:37.7635804Z to keep a reference to the original state_dict as load happens in place so 2025-12-04T13:48:37.7635858Z we need to be able to perform it in place 2025-12-04T13:48:37.7635892Z 2025-12-04T13:48:37.7635943Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7635999Z >>> class RenamePlanner(DefaultLoadPlanner): 2025-12-04T13:48:37.7636059Z >>> def set_up_planner( 2025-12-04T13:48:37.7636096Z >>> self, 2025-12-04T13:48:37.7636147Z >>> state_dict: STATE_DICT_TYPE, 2025-12-04T13:48:37.7636192Z >>> metadata: Metadata, 2025-12-04T13:48:37.7636242Z >>> is_coordinator: bool, 2025-12-04T13:48:37.7636281Z >>> ) -> None: 2025-12-04T13:48:37.7636338Z >>> self.original_state_dict = state_dict 2025-12-04T13:48:37.7636408Z >>> state_dict = {"foo_" + k: v for k, v in state_dict.items()} 2025-12-04T13:48:37.7636458Z >>> 2025-12-04T13:48:37.7636508Z >>> if self.flatten_sharded_tensors: 2025-12-04T13:48:37.7636589Z >>> state_dict = _flatten_sharded_tensors(state_dict) 2025-12-04T13:48:37.7636625Z >>> 2025-12-04T13:48:37.7636673Z >>> if self.flatten_state_dict: 2025-12-04T13:48:37.7636742Z >>> state_dict, self.mappings = flatten_state_dict(state_dict) 
2025-12-04T13:48:37.7636781Z >>> 2025-12-04T13:48:37.7636829Z >>> self.state_dict = state_dict 2025-12-04T13:48:37.7636889Z >>> self.metadata = metadata 2025-12-04T13:48:37.7636941Z >>> self.is_coordinator = is_coordinator 2025-12-04T13:48:37.7636979Z >>> 2025-12-04T13:48:37.7637030Z >>> def load_bytes(self, read_item, value): 2025-12-04T13:48:37.7637076Z >>> # Remove the "foo_" prefix 2025-12-04T13:48:37.7637190Z >>> self.original_state_dict[read_item.dest_index.fqn[4:]] = torch.load(value, weights_only=False) 2025-12-04T13:48:37.7637227Z 2025-12-04T13:48:37.7637261Z 2025-12-04T13:48:37.7637358Z Modifying resolve_tensor and commit_tensor to handle load time transformation. 2025-12-04T13:48:37.7637392Z 2025-12-04T13:48:37.7637444Z >>> # xdoctest: +SKIP("undefined vars") 2025-12-04T13:48:37.7637509Z >>> class MetaModelMaterialize(DefaultSavePlanner): 2025-12-04T13:48:37.7637560Z >>> def resolve_tensor(self, read_item): 2025-12-04T13:48:37.7637615Z >>> tensor = super().resolve_tensor(read_item) 2025-12-04T13:48:37.7637680Z >>> return torch.empty_like(tensor, device="cpu") 2025-12-04T13:48:37.7637714Z >>> 2025-12-04T13:48:37.7637770Z >>> def commit_tensor(self, read_item, tensor): 2025-12-04T13:48:37.7637833Z >>> self.state_dict[read_item.dest_index.fqn] = tensor 2025-12-04T13:48:37.7637870Z 2025-12-04T13:48:37.7638050Z Original Error: IndentationError('expected an indented block after function definition on line 22', ('', 23, 0, '_._ = None\n', 23, -1)) 2025-12-04T13:48:37.7638088Z 2025-12-04T13:48:37.7638123Z _._ = None 2025-12-04T13:48:37.7638160Z ^ 2025-12-04T13:48:37.7638203Z warnings.warn(msg) 2025-12-04T13:48:37.7638239Z 2025-12-04T13:48:37.7638312Z --- Parse Warning: 15 / 17 --- 2025-12-04T13:48:37.7638627Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=FullStateDictConfig in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/api.py line=295. 
2025-12-04T13:48:37.7638719Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7638755Z 2025-12-04T13:48:37.7638833Z ``FullStateDictConfig`` is a config class meant to be used with 2025-12-04T13:48:37.7638913Z ``StateDictType.FULL_STATE_DICT``. We recommend enabling both 2025-12-04T13:48:37.7638991Z ``offload_to_cpu=True`` and ``rank0_only=True`` when saving full state 2025-12-04T13:48:37.7639078Z dicts to save GPU memory and CPU memory, respectively. This config class 2025-12-04T13:48:37.7639153Z is meant to be used via the :func:`state_dict_type` context manager as 2025-12-04T13:48:37.7639192Z follows: 2025-12-04T13:48:37.7639225Z 2025-12-04T13:48:37.7639281Z >>> # xdoctest: +SKIP("undefined variables") 2025-12-04T13:48:37.7639368Z >>> from torch.distributed.fsdp import FullyShardedDataParallel as FSDP 2025-12-04T13:48:37.7639437Z >>> fsdp = FSDP(model, auto_wrap_policy=...) 2025-12-04T13:48:37.7639514Z >>> cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True) 2025-12-04T13:48:37.7639599Z >>> with FSDP.state_dict_type(fsdp, StateDictType.FULL_STATE_DICT, cfg): 2025-12-04T13:48:37.7639648Z >>> state = fsdp.state_dict() 2025-12-04T13:48:37.7639731Z >>> # `state` will be empty on non rank 0 and contain CPU tensors on rank 0. 2025-12-04T13:48:37.7639818Z >>> # To reload checkpoint for inference, finetuning, transfer learning, etc: 2025-12-04T13:48:37.7639921Z >>> model = model_fn() # Initialize model in preparation for wrapping with FSDP 2025-12-04T13:48:37.7639986Z >>> if dist.get_rank() == 0: 2025-12-04T13:48:37.7640060Z >>> # Load checkpoint only on rank 0 to avoid memory redundancy 2025-12-04T13:48:37.7640119Z >>> state_dict = torch.load("my_checkpoint.pt") 2025-12-04T13:48:37.7640170Z >>> model.load_state_dict(state_dict) 2025-12-04T13:48:37.7640271Z >>> # All ranks initialize FSDP module as usual. 
`sync_module_states` argument 2025-12-04T13:48:37.7640359Z >>> # communicates loaded checkpoint states from rank 0 to rest of the world. 2025-12-04T13:48:37.7640402Z >>> fsdp = FSDP( 2025-12-04T13:48:37.7640439Z ... model, 2025-12-04T13:48:37.7640496Z ... device_id=torch.cuda.current_device(), 2025-12-04T13:48:37.7640540Z ... auto_wrap_policy=..., 2025-12-04T13:48:37.7640586Z ... sync_module_states=True, 2025-12-04T13:48:37.7640622Z ... ) 2025-12-04T13:48:37.7640704Z >>> # After this point, all ranks have FSDP model with loaded checkpoint. 2025-12-04T13:48:37.7640738Z 2025-12-04T13:48:37.7640778Z Attributes: 2025-12-04T13:48:37.7640854Z rank0_only (bool): If ``True``, then only rank 0 saves the full state 2025-12-04T13:48:37.7640931Z dict, and nonzero ranks save an empty dict. If ``False``, then all 2025-12-04T13:48:37.7640995Z ranks save the full state dict. (Default: ``False``) 2025-12-04T13:48:37.7641032Z 2025-12-04T13:48:37.7641201Z Original Error: IndentationError("expected an indented block after 'if' statement on line 10", ('', 11, 1, '_._ = None\n', 11, 2)) 2025-12-04T13:48:37.7641239Z 2025-12-04T13:48:37.7641274Z _._ = None 2025-12-04T13:48:37.7641311Z ^ 2025-12-04T13:48:37.7641351Z warnings.warn(msg) 2025-12-04T13:48:37.7641386Z 2025-12-04T13:48:37.7641460Z --- Parse Warning: 16 / 17 --- 2025-12-04T13:48:37.7641780Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=register_parametrization in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/parametrize.py line=437. 2025-12-04T13:48:37.7641906Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7641972Z Register a parametrization to a tensor in a module. 2025-12-04T13:48:37.7642007Z 2025-12-04T13:48:37.7642112Z Assume that ``tensor_name="weight"`` for simplicity. 
When accessing ``module.weight``, 2025-12-04T13:48:37.7642215Z the module will return the parametrized version ``parametrization(module.weight)``. 2025-12-04T13:48:37.7642312Z If the original tensor requires a gradient, the backward pass will differentiate 2025-12-04T13:48:37.7642420Z through :attr:`parametrization`, and the optimizer will update the tensor accordingly. 2025-12-04T13:48:37.7642458Z 2025-12-04T13:48:37.7642570Z The first time that a module registers a parametrization, this function will add an attribute 2025-12-04T13:48:37.7642663Z ``parametrizations`` to the module of type :class:`~ParametrizationList`. 2025-12-04T13:48:37.7642697Z 2025-12-04T13:48:37.7642810Z The list of parametrizations on the tensor ``weight`` will be accessible under 2025-12-04T13:48:37.7642864Z ``module.parametrizations.weight``. 2025-12-04T13:48:37.7642917Z 2025-12-04T13:48:37.7642974Z The original tensor will be accessible under 2025-12-04T13:48:37.7643037Z ``module.parametrizations.weight.original``. 2025-12-04T13:48:37.7643071Z 2025-12-04T13:48:37.7643169Z Parametrizations may be concatenated by registering several parametrizations 2025-12-04T13:48:37.7643213Z on the same attribute. 2025-12-04T13:48:37.7643248Z 2025-12-04T13:48:37.7643337Z The training mode of a registered parametrization is updated on registration 2025-12-04T13:48:37.7643415Z to match the training mode of the host module 2025-12-04T13:48:37.7643449Z 2025-12-04T13:48:37.7643573Z Parametrized parameters and buffers have an inbuilt caching system that can be activated 2025-12-04T13:48:37.7643625Z using the context manager :func:`cached`. 2025-12-04T13:48:37.7643662Z 2025-12-04T13:48:37.7643749Z A :attr:`parametrization` may optionally implement a method with signature 2025-12-04T13:48:37.7643786Z 2025-12-04T13:48:37.7643846Z .. 
code-block:: python 2025-12-04T13:48:37.7643883Z 2025-12-04T13:48:37.7643965Z def right_inverse(self, X: Tensor) -> Union[Tensor, Sequence[Tensor]] 2025-12-04T13:48:37.7644000Z 2025-12-04T13:48:37.7644095Z This method is called on the unparametrized tensor when the first parametrization 2025-12-04T13:48:37.7644175Z is registered to compute the initial value of the original tensor. 2025-12-04T13:48:37.7644283Z If this method is not implemented, the original tensor will be just the unparametrized tensor. 2025-12-04T13:48:37.7644319Z 2025-12-04T13:48:37.7644430Z If all the parametrizations registered on a tensor implement `right_inverse` it is possible 2025-12-04T13:48:37.7644534Z to initialize a parametrized tensor by assigning to it, as shown in the example below. 2025-12-04T13:48:37.7644569Z 2025-12-04T13:48:37.7644652Z It is possible for the first parametrization to depend on several inputs. 2025-12-04T13:48:37.7644745Z This may be implemented returning a tuple of tensors from ``right_inverse`` 2025-12-04T13:48:37.7644831Z (see the example implementation of a ``RankOne`` parametrization below). 2025-12-04T13:48:37.7644867Z 2025-12-04T13:48:37.7644988Z In this case, the unconstrained tensors are also located under ``module.parametrizations.weight`` 2025-12-04T13:48:37.7645043Z with names ``original0``, ``original1``,... 2025-12-04T13:48:37.7645079Z 2025-12-04T13:48:37.7645119Z .. note:: 2025-12-04T13:48:37.7645152Z 2025-12-04T13:48:37.7645256Z If unsafe=False (default) both the forward and right_inverse methods will be called 2025-12-04T13:48:37.7645316Z once to perform a number of consistency checks. 2025-12-04T13:48:37.7645415Z If unsafe=True, then right_inverse will be called if the tensor is not parametrized, 2025-12-04T13:48:37.7645465Z and nothing will be called otherwise. 2025-12-04T13:48:37.7645500Z 2025-12-04T13:48:37.7645537Z .. 
note:: 2025-12-04T13:48:37.7645572Z 2025-12-04T13:48:37.7645647Z In most situations, ``right_inverse`` will be a function such that 2025-12-04T13:48:37.7645699Z ``forward(right_inverse(X)) == X`` (see 2025-12-04T13:48:37.7645805Z `right inverse `_). 2025-12-04T13:48:37.7645899Z Sometimes, when the parametrization is not surjective, it may be reasonable 2025-12-04T13:48:37.7645940Z to relax this. 2025-12-04T13:48:37.7645975Z 2025-12-04T13:48:37.7646013Z .. warning:: 2025-12-04T13:48:37.7646050Z 2025-12-04T13:48:37.7646149Z If a parametrization depends on several inputs, :func:`~register_parametrization` 2025-12-04T13:48:37.7646246Z will register a number of new parameters. If such parametrization is registered 2025-12-04T13:48:37.7646346Z after the optimizer is created, these new parameters will need to be added manually 2025-12-04T13:48:37.7646438Z to the optimizer. See :meth:`torch.Optimizer.add_param_group`. 2025-12-04T13:48:37.7646471Z 2025-12-04T13:48:37.7646508Z Args: 2025-12-04T13:48:37.7646588Z module (nn.Module): module on which to register the parametrization 2025-12-04T13:48:37.7646673Z tensor_name (str): name of the parameter or buffer on which to register 2025-12-04T13:48:37.7646719Z the parametrization 2025-12-04T13:48:37.7646812Z parametrization (nn.Module): the parametrization to register 2025-12-04T13:48:37.7646851Z Keyword args: 2025-12-04T13:48:37.7646945Z unsafe (bool): a boolean flag that denotes whether the parametrization 2025-12-04T13:48:37.7647019Z may change the dtype and shape of the tensor. Default: `False` 2025-12-04T13:48:37.7647117Z Warning: the parametrization is not checked for consistency upon registration. 2025-12-04T13:48:37.7647181Z Enable this flag at your own risk. 
2025-12-04T13:48:37.7647218Z 2025-12-04T13:48:37.7647253Z Raises: 2025-12-04T13:48:37.7647360Z ValueError: if the module does not have a parameter or a buffer named :attr:`tensor_name` 2025-12-04T13:48:37.7647394Z 2025-12-04T13:48:37.7647434Z Examples: 2025-12-04T13:48:37.7647497Z >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_LAPACK) 2025-12-04T13:48:37.7647543Z >>> import torch 2025-12-04T13:48:37.7647589Z >>> import torch.nn as nn 2025-12-04T13:48:37.7647651Z >>> import torch.nn.utils.parametrize as P 2025-12-04T13:48:37.7647687Z >>> 2025-12-04T13:48:37.7647736Z >>> class Symmetric(nn.Module): 2025-12-04T13:48:37.7647780Z >>> def forward(self, X): 2025-12-04T13:48:37.7647853Z >>> return X.triu() + X.triu(1).T # Return a symmetric matrix 2025-12-04T13:48:37.7647889Z >>> 2025-12-04T13:48:37.7647944Z >>> def right_inverse(self, A): 2025-12-04T13:48:37.7647987Z >>> return A.triu() 2025-12-04T13:48:37.7648026Z >>> 2025-12-04T13:48:37.7648070Z >>> m = nn.Linear(5, 5) 2025-12-04T13:48:37.7648141Z >>> P.register_parametrization(m, "weight", Symmetric()) 2025-12-04T13:48:37.7648232Z >>> print(torch.allclose(m.weight, m.weight.T)) # m.weight is now symmetric 2025-12-04T13:48:37.7648270Z True 2025-12-04T13:48:37.7648314Z >>> A = torch.rand(5, 5) 2025-12-04T13:48:37.7648364Z >>> A = A + A.T # A is now symmetric 2025-12-04T13:48:37.7648440Z >>> m.weight = A # Initialize the weight to be the symmetric matrix A 2025-12-04T13:48:37.7648494Z >>> print(torch.allclose(m.weight, A)) 2025-12-04T13:48:37.7648529Z True 2025-12-04T13:48:37.7648565Z 2025-12-04T13:48:37.7648610Z >>> class RankOne(nn.Module): 2025-12-04T13:48:37.7648661Z >>> def forward(self, x, y): 2025-12-04T13:48:37.7648720Z >>> # Form a rank 1 matrix multiplying two vectors 2025-12-04T13:48:37.7648779Z >>> return x.unsqueeze(-1) @ y.unsqueeze(-2) 2025-12-04T13:48:37.7648814Z >>> 2025-12-04T13:48:37.7648864Z >>> def right_inverse(self, Z): 2025-12-04T13:48:37.7648915Z >>> # Project Z onto the rank 1 matrices 
2025-12-04T13:48:37.7648978Z >>> U, S, Vh = torch.linalg.svd(Z, full_matrices=False) 2025-12-04T13:48:37.7649027Z >>> # Return rescaled singular vectors 2025-12-04T13:48:37.7649082Z >>> s0_sqrt = S[0].sqrt().unsqueeze(-1) 2025-12-04T13:48:37.7649141Z >>> return U[..., :, 0] * s0_sqrt, Vh[..., 0, :] * s0_sqrt 2025-12-04T13:48:37.7649181Z >>> 2025-12-04T13:48:37.7649241Z >>> linear_rank_one = P.register_parametrization( 2025-12-04T13:48:37.7649296Z ... nn.Linear(4, 4), "weight", RankOne() 2025-12-04T13:48:37.7649349Z ... ) 2025-12-04T13:48:37.7649424Z >>> print(torch.linalg.matrix_rank(linear_rank_one.weight).item()) 2025-12-04T13:48:37.7649461Z 1 2025-12-04T13:48:37.7649495Z 2025-12-04T13:48:37.7649530Z 2025-12-04T13:48:37.7649711Z Original Error: IndentationError('expected an indented block after function definition on line 2', ('', 3, 0, '_._ = None\n', 3, -1)) 2025-12-04T13:48:37.7649747Z 2025-12-04T13:48:37.7649793Z _._ = None 2025-12-04T13:48:37.7649829Z ^ 2025-12-04T13:48:37.7649870Z warnings.warn(msg) 2025-12-04T13:48:37.7649903Z 2025-12-04T13:48:37.7649997Z --- Parse Warning: 17 / 17 --- 2025-12-04T13:48:37.7650308Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/xdoctest/core.py:416: UserWarning: Cannot scrape callname=ReduceLROnPlateau in modpath=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/optim/lr_scheduler.py line=1586. 2025-12-04T13:48:37.7650409Z Caused by: DoctestParseError('Failed to parse doctest in _package_groups') 2025-12-04T13:48:37.7650483Z Reduce learning rate when a metric has stopped improving. 2025-12-04T13:48:37.7650516Z 2025-12-04T13:48:37.7650596Z Models often benefit from reducing the learning rate by a factor 2025-12-04T13:48:37.7650669Z of 2-10 once learning stagnates. This scheduler reads a metrics 2025-12-04T13:48:37.7650744Z quantity and if no improvement is seen for a 'patience' number 2025-12-04T13:48:37.7650796Z of epochs, the learning rate is reduced. 
2025-12-04T13:48:37.7650832Z 2025-12-04T13:48:37.7650867Z Args: 2025-12-04T13:48:37.7650926Z optimizer (Optimizer): Wrapped optimizer. 2025-12-04T13:48:37.7650990Z mode (str): One of `min`, `max`. In `min` mode, lr will 2025-12-04T13:48:37.7651057Z be reduced when the quantity monitored has stopped 2025-12-04T13:48:37.7651121Z decreasing; in `max` mode it will be reduced when the 2025-12-04T13:48:37.7651198Z quantity monitored has stopped increasing. Default: 'min'. 2025-12-04T13:48:37.7651265Z factor (float): Factor by which the learning rate will be 2025-12-04T13:48:37.7651325Z reduced. new_lr = lr * factor. Default: 0.1. 2025-12-04T13:48:37.7651406Z patience (int): The number of allowed epochs with no improvement after 2025-12-04T13:48:37.7651460Z which the learning rate will be reduced. 2025-12-04T13:48:37.7651545Z For example, consider the case of having no patience (`patience = 0`). 2025-12-04T13:48:37.7651675Z In the first epoch, a baseline is established and is always considered good as there's no previous baseline. 2025-12-04T13:48:37.7651752Z In the second epoch, if the performance is worse than the baseline, 2025-12-04T13:48:37.7651814Z we have what is considered an intolerable epoch. 2025-12-04T13:48:37.7651952Z Since the count of intolerable epochs (1) is greater than the patience level (0), 2025-12-04T13:48:37.7652021Z the learning rate is reduced at the end of this epoch. 2025-12-04T13:48:37.7652135Z From the third epoch onwards, the learning rate continues to be reduced at the end of each epoch 2025-12-04T13:48:37.7652247Z if the performance is worse than the baseline. If the performance improves or remains the same, 2025-12-04T13:48:37.7652298Z the learning rate is not adjusted. 2025-12-04T13:48:37.7652342Z Default: 10. 2025-12-04T13:48:37.7652416Z threshold (float): Threshold for measuring the new optimum, 2025-12-04T13:48:37.7652482Z to only focus on significant changes. Default: 1e-4. 
2025-12-04T13:48:37.7652552Z threshold_mode (str): One of `rel`, `abs`. In `rel` mode, 2025-12-04T13:48:37.7652614Z dynamic_threshold = best * ( 1 + threshold ) in 'max' 2025-12-04T13:48:37.7652691Z mode or best * ( 1 - threshold ) in `min` mode. 2025-12-04T13:48:37.7652753Z In `abs` mode, dynamic_threshold = best + threshold in 2025-12-04T13:48:37.7652823Z `max` mode or best - threshold in `min` mode. Default: 'rel'. 2025-12-04T13:48:37.7652891Z cooldown (int): Number of epochs to wait before resuming 2025-12-04T13:48:37.7652962Z normal operation after lr has been reduced. Default: 0. 2025-12-04T13:48:37.7653053Z min_lr (float or list): A scalar or a list of scalars. A 2025-12-04T13:48:37.7653120Z lower bound on the learning rate of all param groups 2025-12-04T13:48:37.7653189Z or each group respectively. Default: 0. 2025-12-04T13:48:37.7653261Z eps (float): Minimal decay applied to lr. If the difference 2025-12-04T13:48:37.7653330Z between new and old lr is smaller than eps, the update is 2025-12-04T13:48:37.7653379Z ignored. Default: 1e-8. 2025-12-04T13:48:37.7653427Z 2025-12-04T13:48:37.7653465Z Example: 2025-12-04T13:48:37.7653512Z >>> # xdoctest: +SKIP 2025-12-04T13:48:37.7653599Z >>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9) 2025-12-04T13:48:37.7653661Z >>> scheduler = ReduceLROnPlateau(optimizer, "min") 2025-12-04T13:48:37.7653706Z >>> for epoch in range(10): 2025-12-04T13:48:37.7653747Z >>> train(...) 2025-12-04T13:48:37.7653800Z >>> val_loss = validate(...) 2025-12-04T13:48:37.7653863Z >>> # Note that step should be called after validate() 2025-12-04T13:48:37.7653916Z >>> scheduler.step(val_loss) 2025-12-04T13:48:37.7653950Z 2025-12-04T13:48:37.7654026Z .. 
image:: ../scripts/lr_scheduler_images/ReduceLROnPlateau.png 2025-12-04T13:48:37.7654061Z 2025-12-04T13:48:37.7654211Z Original Error: IndentationError('unexpected indent', ('', 8, 4, ' scheduler.step(val_loss)\n', 8, -1)) 2025-12-04T13:48:37.7654248Z 2025-12-04T13:48:37.7654296Z scheduler.step(val_loss) 2025-12-04T13:48:37.7654331Z ^ 2025-12-04T13:48:37.7654374Z warnings.warn(msg) 2025-12-04T13:48:37.7654408Z 2025-12-04T13:48:37.7654458Z  2025-12-04T13:48:37.7654531Z === Found 10 run-time warnings === 2025-12-04T13:48:37.7654606Z --- Runtime Warning: 1 / 10 --- 2025-12-04T13:48:37.7654684Z example = 2025-12-04T13:48:37.7655171Z :3: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage() 2025-12-04T13:48:37.7655210Z 2025-12-04T13:48:37.7655282Z --- Runtime Warning: 2 / 10 --- 2025-12-04T13:48:37.7655380Z example = 2025-12-04T13:48:37.7655818Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_tensor.py:1392: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /var/lib/jenkins/workspace/c10/core/TensorImpl.h:1973.) 2025-12-04T13:48:37.7655871Z return super().refine_names(names) 2025-12-04T13:48:37.7655906Z 2025-12-04T13:48:37.7655980Z --- Runtime Warning: 3 / 10 --- 2025-12-04T13:48:37.7656091Z example = 2025-12-04T13:48:37.7656305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/library.py:275: UserWarning: Warning only once for all operators, other operators may also be overridden. 
2025-12-04T13:48:37.7656432Z Overriding a previously registered kernel for the same operator and the same dispatch key 2025-12-04T13:48:37.7656517Z operator: aten::div.Tensor(Tensor self, Tensor other) -> Tensor 2025-12-04T13:48:37.7656625Z registered at /var/lib/jenkins/workspace/build/aten/src/ATen/RegisterSchema.cpp:6 2025-12-04T13:48:37.7656669Z dispatch key: CPU 2025-12-04T13:48:37.7656821Z previous kernel: registered at /var/lib/jenkins/workspace/aten/src/ATen/LegacyBatchingRegistrations.cpp:1079 2025-12-04T13:48:37.7657174Z new kernel: registered at :1 (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/core/dispatch/OperatorEntry.cpp:208.) 2025-12-04T13:48:37.7657244Z impl_fn(self.ns, name.split("::")[-1], dispatch_key) 2025-12-04T13:48:37.7657278Z 2025-12-04T13:48:37.7657351Z --- Runtime Warning: 4 / 10 --- 2025-12-04T13:48:37.7657453Z example = 2025-12-04T13:48:37.7658056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nested/__init__.py:117: UserWarning: The PyTorch API of nested tensors is in prototype stage and will change in the near future. We recommend specifying layout=torch.jagged when constructing a nested tensor, as this layout receives active development, has better operator coverage, and works with torch.compile. (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/NestedTensorImpl.cpp:178.) 2025-12-04T13:48:37.7658150Z return torch._nested_tensor_from_tensor_list(ts, dtype, None, device, None) 2025-12-04T13:48:37.7658186Z 2025-12-04T13:48:37.7658256Z --- Runtime Warning: 5 / 10 --- 2025-12-04T13:48:37.7658351Z example = 2025-12-04T13:48:37.7658884Z :1: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at /var/lib/jenkins/workspace/aten/src/ATen/SparseCsrTensorImpl.cpp:53.) 
2025-12-04T13:48:37.7658919Z 2025-12-04T13:48:37.7658992Z --- Runtime Warning: 6 / 10 --- 2025-12-04T13:48:37.7659104Z example = 2025-12-04T13:48:37.7659615Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/fx/experimental/const_fold.py:314: UserWarning: Attempted to insert a get_attr Node with no underlying reference in the owning GraphModule! Call GraphModule.add_submodule to add the necessary submodule, GraphModule.add_parameter to add the necessary Parameter, or nn.Module.register_buffer to add the necessary buffer 2025-12-04T13:48:37.7659684Z new_node = root_const_gm.graph.get_attr(in_node.target) 2025-12-04T13:48:37.7659720Z 2025-12-04T13:48:37.7659790Z --- Runtime Warning: 7 / 10 --- 2025-12-04T13:48:37.7659898Z example = 2025-12-04T13:48:37.7660260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:48:37.7660313Z self.encoder = TransformerEncoder( 2025-12-04T13:48:37.7660349Z 2025-12-04T13:48:37.7660421Z --- Runtime Warning: 8 / 10 --- 2025-12-04T13:48:37.7660537Z example = 2025-12-04T13:48:37.7660938Z :2: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:48:37.7660989Z 2025-12-04T13:48:37.7661058Z --- Runtime Warning: 9 / 10 --- 2025-12-04T13:48:37.7661161Z example = 2025-12-04T13:48:37.7661446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`. 
2025-12-04T13:48:37.7661511Z WeightNorm.apply(module, name, dim) 2025-12-04T13:48:37.7661546Z 2025-12-04T13:48:37.7661620Z --- Runtime Warning: 10 / 10 --- 2025-12-04T13:48:37.7661729Z example = 2025-12-04T13:48:37.7662059Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`. 2025-12-04T13:48:37.7662111Z WeightNorm.apply(module, name, dim) 2025-12-04T13:48:37.7662150Z 2025-12-04T13:48:37.7662268Z === 378 passed, 516 skipped, 27 warnings in 26.94 seconds === 2025-12-04T13:48:37.7662373Z Finished doctests 1/1 ... [2025-12-04 13:48:37.752985][3582626.277788934], took 0.45min 2025-12-04T13:48:37.7662613Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/inductor.test_aot_inductor/inductor.test_aot_inductor-3bc2b30a2382b82e.xml 2025-12-04T13:48:37.7662699Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:48:37.7662794Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:48:37.7662844Z Uploading artifacts took 0.00 seconds 2025-12-04T13:48:39.8751532Z Running test batch 'tests to run' cost 16149.93 seconds 2025-12-04T13:48:39.8760554Z Emitting td_test_failure_stats_v2 2025-12-04T13:48:39.8763336Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856119_f08dc7cad11711f0a60a26383c68b1b6 2025-12-04T13:48:41.8974027Z /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py:156: UserWarning: Error uploading metric td_test_failure_stats_v2 to DynamoDB: Unable to locate credentials 2025-12-04T13:48:41.8974982Z warn(f"Error uploading metric {metric_name} to DynamoDB: {e}") 2025-12-04T13:48:41.8975428Z inductor/test_cuda_select_algorithm 1/1 failed! 
2025-12-04T13:48:42.8371270Z 2025-12-04T13:48:42.8371674Z real 269m15.890s 2025-12-04T13:48:42.8372094Z user 1488m39.833s 2025-12-04T13:48:42.8372253Z sys 110m7.007s 2025-12-04T13:48:42.8372408Z + sccache_epilogue 2025-12-04T13:48:42.8372624Z + echo '::group::Sccache Compilation Log' 2025-12-04T13:48:42.8373173Z ##[group]Sccache Compilation Log 2025-12-04T13:48:42.8373409Z + echo '=================== sccache compilation log ===================' 2025-12-04T13:48:42.8373678Z =================== sccache compilation log =================== 2025-12-04T13:48:42.8374047Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log 2025-12-04T13:48:42.8446858Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ===========' 2025-12-04T13:48:42.8447241Z =========== If your build fails, please take a look at the log above for possible reasons =========== 2025-12-04T13:48:42.8447512Z + sccache --show-stats 2025-12-04T13:48:42.8468258Z Compile requests 5025 2025-12-04T13:48:42.8468517Z Compile requests executed 866 2025-12-04T13:48:42.8468697Z Cache hits 112 2025-12-04T13:48:42.8468863Z Cache hits (C/C++) 112 2025-12-04T13:48:42.8469020Z Cache misses 731 2025-12-04T13:48:42.8469528Z Cache misses (C/C++) 725 2025-12-04T13:48:42.8469689Z Cache misses (HIP) 6 2025-12-04T13:48:42.8469853Z Cache hits rate 13.29 % 2025-12-04T13:48:42.8470032Z Cache hits rate (C/C++) 13.38 % 2025-12-04T13:48:42.8470202Z Cache hits rate (HIP) 0.00 % 2025-12-04T13:48:42.8470366Z Cache timeouts 0 2025-12-04T13:48:42.8470550Z Cache read errors 0 2025-12-04T13:48:42.8470711Z Forced recaches 0 2025-12-04T13:48:42.8470941Z Cache write errors 0 2025-12-04T13:48:42.8471107Z Cache errors 0 2025-12-04T13:48:42.8471326Z Compilations 731 2025-12-04T13:48:42.8471487Z Compilation failures 23 2025-12-04T13:48:42.8471658Z Non-cacheable compilations 0 2025-12-04T13:48:42.8471829Z Non-cacheable calls 442 2025-12-04T13:48:42.8472044Z 
Non-compilation calls 3717 2025-12-04T13:48:42.8472258Z Unsupported compiler calls 0 2025-12-04T13:48:42.8472440Z Average cache write 0.000 s 2025-12-04T13:48:42.8472611Z Average compiler 1.489 s 2025-12-04T13:48:42.8472786Z Average cache read hit 0.000 s 2025-12-04T13:48:42.8472967Z Failed distributed compilations 0 2025-12-04T13:48:42.8473078Z 2025-12-04T13:48:42.8473144Z Non-cacheable reasons: 2025-12-04T13:48:42.8473298Z unknown source language 385 2025-12-04T13:48:42.8473476Z -E 57 2025-12-04T13:48:42.8473582Z 2025-12-04T13:48:42.8473695Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-12-04T13:48:42.8473931Z Use direct/preprocessor mode? yes 2025-12-04T13:48:42.8474102Z Version (client) 0.10.0 2025-12-04T13:48:42.8474284Z Cache size 45 MiB 2025-12-04T13:48:42.8474452Z Max cache size 10 GiB 2025-12-04T13:48:42.8474623Z + sccache --stop-server 2025-12-04T13:48:42.8482135Z Stopping sccache server... 2025-12-04T13:48:42.8486430Z Compile requests 5025 2025-12-04T13:48:42.8486757Z Compile requests executed 866 2025-12-04T13:48:42.8486966Z Cache hits 112 2025-12-04T13:48:42.8487172Z Cache hits (C/C++) 112 2025-12-04T13:48:42.8487371Z Cache misses 731 2025-12-04T13:48:42.8487564Z Cache misses (C/C++) 725 2025-12-04T13:48:42.8487826Z Cache misses (HIP) 6 2025-12-04T13:48:42.8488043Z Cache hits rate 13.29 % 2025-12-04T13:48:42.8488259Z Cache hits rate (C/C++) 13.38 % 2025-12-04T13:48:42.8488468Z Cache hits rate (HIP) 0.00 % 2025-12-04T13:48:42.8488682Z Cache timeouts 0 2025-12-04T13:48:42.8488885Z Cache read errors 0 2025-12-04T13:48:42.8489082Z Forced recaches 0 2025-12-04T13:48:42.8489284Z Cache write errors 0 2025-12-04T13:48:42.8489490Z Cache errors 0 2025-12-04T13:48:42.8489687Z Compilations 731 2025-12-04T13:48:42.8489890Z Compilation failures 23 2025-12-04T13:48:42.8490099Z Non-cacheable compilations 0 2025-12-04T13:48:42.8490303Z Non-cacheable calls 442 2025-12-04T13:48:42.8490505Z Non-compilation calls 3717 2025-12-04T13:48:42.8490711Z 
Unsupported compiler calls 0 2025-12-04T13:48:42.8490922Z Average cache write 0.000 s 2025-12-04T13:48:42.8491134Z Average compiler 1.489 s 2025-12-04T13:48:42.8491347Z Average cache read hit 0.000 s 2025-12-04T13:48:42.8491560Z Failed distributed compilations 0 2025-12-04T13:48:42.8491700Z 2025-12-04T13:48:42.8491776Z Non-cacheable reasons: 2025-12-04T13:48:42.8492047Z unknown source language 385 2025-12-04T13:48:42.8492248Z -E 57 2025-12-04T13:48:42.8492722Z 2025-12-04T13:48:42.8492860Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-12-04T13:48:42.8493137Z Use direct/preprocessor mode? yes 2025-12-04T13:48:42.8493351Z Version (client) 0.10.0 2025-12-04T13:48:42.8493565Z Cache size 45 MiB 2025-12-04T13:48:42.8493778Z Max cache size 10 GiB 2025-12-04T13:48:42.8494024Z + echo ::endgroup:: 2025-12-04T13:48:42.8494647Z ##[endgroup] 2025-12-04T13:48:42.8553483Z ##[error]Process completed with exit code 1. 2025-12-04T13:48:42.8587892Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:48:42.8588210Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:48:42.8588577Z docker exec -t "617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test" 2025-12-04T13:48:42.8592831Z shell: /usr/bin/bash -e {0} 2025-12-04T13:48:42.8592943Z env: 2025-12-04T13:48:42.8593042Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:42.8593180Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:42.8593360Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:42.8593524Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:42.8593917Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE 
--security-opt seccomp=unconfined --network=host 2025-12-04T13:48:42.8594286Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:42.8594401Z AWS_REGION: us-east-1 2025-12-04T13:48:42.8594582Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:42.8594730Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:42.8597043Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:42.8597212Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:42.8597394Z ##[endgroup] 2025-12-04T13:48:42.9276767Z ##[group]Run docker exec -t "617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:48:42.9277143Z docker exec -t "617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:48:42.9280081Z shell: /usr/bin/bash -e {0} 2025-12-04T13:48:42.9280190Z env: 2025-12-04T13:48:42.9280286Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:42.9280423Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:42.9280602Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:42.9280765Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:42.9281143Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:42.9281521Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:42.9281638Z AWS_REGION: us-east-1 2025-12-04T13:48:42.9281768Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:42.9281963Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:42.9284145Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:42.9284310Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:42.9284485Z ##[endgroup] 2025-12-04T13:48:43.0044678Z ##[group]Run cat test/**/*_toprint.log || true 2025-12-04T13:48:43.0044885Z cat test/**/*_toprint.log || true 
2025-12-04T13:48:43.0048355Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T13:48:43.0048554Z env: 2025-12-04T13:48:43.0048694Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:43.0048835Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:43.0049022Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:43.0049255Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:43.0049656Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:43.0050049Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:43.0050170Z AWS_REGION: us-east-1 2025-12-04T13:48:43.0050314Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:43.0050527Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:43.0052927Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:43.0053106Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:43.0053297Z ##[endgroup] 2025-12-04T13:48:43.0102906Z cat: 'test/**/*_toprint.log': No such file or directory 2025-12-04T13:48:43.0178915Z Prepare all required actions 2025-12-04T13:48:43.0179466Z Getting action download info 2025-12-04T13:48:43.3498470Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T13:48:44.1528955Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-12-04T13:48:45.0778411Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-12-04T13:48:45.0778554Z with: 2025-12-04T13:48:45.0778641Z use-gha: true 2025-12-04T13:48:45.0778789Z file-suffix: test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137 2025-12-04T13:48:45.0778955Z s3-bucket: gha-artifacts 2025-12-04T13:48:45.0779059Z env: 2025-12-04T13:48:45.0779145Z GIT_DEFAULT_BRANCH: main 
2025-12-04T13:48:45.0779276Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:45.0779460Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:45.0779645Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:45.0780034Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:45.0780405Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:45.0780525Z AWS_REGION: us-east-1 2025-12-04T13:48:45.0780695Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:45.0780844Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:45.0783351Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:45.0783523Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:45.0783705Z ##[endgroup] 2025-12-04T13:48:45.0813365Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:48:45.0813492Z with: 2025-12-04T13:48:45.0813670Z name: test-jsons-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip 2025-12-04T13:48:45.0813871Z retention-days: 14 2025-12-04T13:48:45.0813981Z if-no-files-found: warn 2025-12-04T13:48:45.0814088Z path: test/**/*.json 2025-12-04T13:48:45.0814188Z compression-level: 6 2025-12-04T13:48:45.0814289Z overwrite: false 2025-12-04T13:48:45.0814392Z include-hidden-files: false 2025-12-04T13:48:45.0814500Z env: 2025-12-04T13:48:45.0814591Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:45.0814726Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:45.0814897Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:45.0815060Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:45.0815437Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin 
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:45.0815803Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:45.0815916Z AWS_REGION: us-east-1 2025-12-04T13:48:45.0816042Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:45.0816189Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:45.0818365Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:45.0818530Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:45.0818774Z ##[endgroup] 2025-12-04T13:48:45.4513142Z With the provided path, there will be 6 files uploaded 2025-12-04T13:48:45.4516121Z Artifact name is valid! 2025-12-04T13:48:45.4517175Z Root directory input is valid! 2025-12-04T13:48:45.6771427Z Beginning upload of artifact content to blob storage 2025-12-04T13:48:46.0313671Z Uploaded bytes 46621 2025-12-04T13:48:46.0972222Z Finished uploading artifact content to blob storage! 2025-12-04T13:48:46.0972907Z SHA256 digest of uploaded artifact zip is bcbe8c9a081a6502675e61fddffb8090ceb9e1f139baca33bafea58674aa3363 2025-12-04T13:48:46.0973922Z Finalizing artifact upload 2025-12-04T13:48:46.2403673Z Artifact test-jsons-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip.zip successfully finalized. Artifact ID 4764709967 2025-12-04T13:48:46.2404675Z Artifact test-jsons-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip has been successfully uploaded! Final size is 46621 bytes. 
Artifact ID is 4764709967 2025-12-04T13:48:46.2408540Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764709967 2025-12-04T13:48:46.2516434Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:48:46.2516601Z with: 2025-12-04T13:48:46.2516806Z name: test-reports-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip 2025-12-04T13:48:46.2517029Z retention-days: 14 2025-12-04T13:48:46.2517144Z if-no-files-found: ignore 2025-12-04T13:48:46.2517284Z path: test/**/*.xml test/**/*.csv 2025-12-04T13:48:46.2517412Z compression-level: 6 2025-12-04T13:48:46.2517524Z overwrite: false 2025-12-04T13:48:46.2517631Z include-hidden-files: false 2025-12-04T13:48:46.2517744Z env: 2025-12-04T13:48:46.2517837Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:46.2517990Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:46.2518176Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:46.2518355Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:46.2518750Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:46.2519126Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:46.2519244Z AWS_REGION: us-east-1 2025-12-04T13:48:46.2519410Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:46.2519566Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:46.2521778Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:46.2521989Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:46.2522169Z ##[endgroup] 2025-12-04T13:48:46.6285345Z With the provided path, there will be 265 files uploaded 2025-12-04T13:48:46.6285663Z Artifact name is valid! 2025-12-04T13:48:46.6285794Z Root directory input is valid! 
2025-12-04T13:48:46.8656814Z Beginning upload of artifact content to blob storage 2025-12-04T13:48:47.5582252Z Uploaded bytes 887512 2025-12-04T13:48:47.6262454Z Finished uploading artifact content to blob storage! 2025-12-04T13:48:47.6264069Z SHA256 digest of uploaded artifact zip is d737775831ecbd77d4c412291bcb261079d2603799c0d446d31cb43304854f33 2025-12-04T13:48:47.6264786Z Finalizing artifact upload 2025-12-04T13:48:47.7759776Z Artifact test-reports-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip.zip successfully finalized. Artifact ID 4764710240 2025-12-04T13:48:47.7761200Z Artifact test-reports-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip has been successfully uploaded! Final size is 887512 bytes. Artifact ID is 4764710240 2025-12-04T13:48:47.7764916Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764710240 2025-12-04T13:48:47.7885791Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:48:47.7885977Z with: 2025-12-04T13:48:47.7886194Z name: logs-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip 2025-12-04T13:48:47.7886522Z retention-days: 14 2025-12-04T13:48:47.7886673Z if-no-files-found: ignore 2025-12-04T13:48:47.7886829Z path: usage_log.txt test/**/*.log 2025-12-04T13:48:47.7886988Z compression-level: 6 2025-12-04T13:48:47.7887118Z overwrite: false 2025-12-04T13:48:47.7887258Z include-hidden-files: false 2025-12-04T13:48:47.7887401Z env: 2025-12-04T13:48:47.7887513Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:47.7887682Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:47.7887985Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:47.7888176Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:47.7888692Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin 
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:47.7889077Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:47.7889203Z AWS_REGION: us-east-1 2025-12-04T13:48:47.7889363Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:47.7889529Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:47.7891739Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:47.7892162Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:47.7892351Z ##[endgroup] 2025-12-04T13:48:48.1785863Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-12-04T13:48:48.1786845Z The least common ancestor is /home/runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-12-04T13:48:48.1787308Z With the provided path, there will be 90 files uploaded 2025-12-04T13:48:48.1789411Z Artifact name is valid! 2025-12-04T13:48:48.1790088Z Root directory input is valid! 2025-12-04T13:48:48.4092308Z Beginning upload of artifact content to blob storage 2025-12-04T13:48:49.1099161Z Uploaded bytes 798832 2025-12-04T13:48:49.1885257Z Finished uploading artifact content to blob storage! 2025-12-04T13:48:49.1886664Z SHA256 digest of uploaded artifact zip is 68e614f0605d7c8b272e354b0be9edd2d49d6e270a663026a40b4cde60bf3e6a 2025-12-04T13:48:49.1887373Z Finalizing artifact upload 2025-12-04T13:48:49.3260076Z Artifact logs-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip.zip successfully finalized. Artifact ID 4764710503 2025-12-04T13:48:49.3260973Z Artifact logs-runattempt1-test-default-3-6-linux.rocm.gpu.gfx942.1.b_57116213137.zip has been successfully uploaded! Final size is 798832 bytes. 
Artifact ID is 4764710503 2025-12-04T13:48:49.3265243Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764710503 2025-12-04T13:48:49.3387964Z ##[group]Run # shellcheck disable=SC2156 2025-12-04T13:48:49.3388213Z # shellcheck disable=SC2156 2025-12-04T13:48:49.3388540Z find . -iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-12-04T13:48:49.3393264Z shell: /usr/bin/bash -e {0} 2025-12-04T13:48:49.3393410Z env: 2025-12-04T13:48:49.3393526Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:49.3393688Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:49.3393904Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:49.3394096Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:49.3394526Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:49.3394944Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:49.3395106Z AWS_REGION: us-east-1 2025-12-04T13:48:49.3395314Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:49.3395502Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:49.3397894Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:49.3398092Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:49.3398379Z ##[endgroup] 2025-12-04T13:48:49.4716000Z ##[group]Run actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 2025-12-04T13:48:49.4716199Z with: 2025-12-04T13:48:49.4716341Z name: coredumps-default-3-6-linux.rocm.gpu.gfx942.1.b 2025-12-04T13:48:49.4716512Z retention-days: 14 2025-12-04T13:48:49.4716635Z if-no-files-found: ignore 2025-12-04T13:48:49.4716764Z path: ./**/core.[1-9]* 2025-12-04T13:48:49.4716889Z compression-level: 6 2025-12-04T13:48:49.4717072Z overwrite: false 
2025-12-04T13:48:49.4717193Z include-hidden-files: false 2025-12-04T13:48:49.4717322Z env: 2025-12-04T13:48:49.4717424Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:48:49.4717582Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:48:49.4717781Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:48:49.4717962Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:48:49.4718388Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD136 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:48:49.4718775Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:48:49.4718913Z AWS_REGION: us-east-1 2025-12-04T13:48:49.4719055Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:48:49.4719219Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:48:49.4721429Z AWS_SESSION_TOKEN: *** 2025-12-04T13:48:49.4721612Z CONTAINER_NAME: 617120bab89fb30c7a4b0a74db2c320a66c2c538ba97c2989f5c244419625fc0 2025-12-04T13:48:49.4721802Z ##[endgroup] 2025-12-04T13:48:53.0401789Z No files were found with the provided path: ./**/core.[1-9]*. No artifacts will be uploaded. 2025-12-04T13:48:53.0573706Z Post job cleanup. 2025-12-04T13:48:53.0597244Z Post job cleanup. 2025-12-04T13:48:53.0779807Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T13:48:53.0962244Z Post job cleanup. 2025-12-04T13:48:53.1553719Z Post job cleanup. 2025-12-04T13:48:53.1574390Z Post job cleanup. 
2025-12-04T13:48:53.2028851Z [command]/usr/bin/git version 2025-12-04T13:48:53.2057066Z git version 2.52.0 2025-12-04T13:48:53.2082149Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/f88f2d15-8533-4862-8d1f-850053b99cc2/.gitconfig' 2025-12-04T13:48:53.2088606Z Temporarily overriding HOME='/home/runner/_work/_temp/f88f2d15-8533-4862-8d1f-850053b99cc2' before making global git config changes 2025-12-04T13:48:53.2088932Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:48:53.2091392Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:48:53.2118939Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:48:53.2139263Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:48:53.2367406Z Entering 'android/libs/fbjni' 2025-12-04T13:48:53.2400979Z Entering 'third_party/FP16' 2025-12-04T13:48:53.2438939Z Entering 'third_party/FXdiv' 2025-12-04T13:48:53.2466902Z Entering 'third_party/NNPACK' 2025-12-04T13:48:53.2491266Z Entering 'third_party/NVTX' 2025-12-04T13:48:53.2514828Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:53.2538637Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:53.2567416Z Entering 'third_party/aiter' 2025-12-04T13:48:53.2597965Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:53.2632422Z Entering 'third_party/benchmark' 2025-12-04T13:48:53.2664703Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:53.2696081Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:53.2732136Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:53.2754854Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:53.2782196Z Entering 'third_party/cutlass' 2025-12-04T13:48:53.2815488Z Entering 'third_party/fbgemm' 2025-12-04T13:48:53.2844679Z 
Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:53.2866897Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:53.2892022Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:53.2915596Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:53.2941689Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:53.2963760Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:53.2988868Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:53.3013810Z Entering 'third_party/flash-attention' 2025-12-04T13:48:53.3053026Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:53.3076143Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:53.3102227Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:53.3124763Z Entering 'third_party/fmt' 2025-12-04T13:48:53.3159736Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:53.3185105Z Entering 'third_party/gloo' 2025-12-04T13:48:53.3206174Z Entering 'third_party/googletest' 2025-12-04T13:48:53.3230692Z Entering 'third_party/ideep' 2025-12-04T13:48:53.3256446Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:53.3288203Z Entering 'third_party/ittapi' 2025-12-04T13:48:53.3311151Z Entering 'third_party/kineto' 2025-12-04T13:48:53.3339449Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:53.3378469Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:53.3412296Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:53.3435020Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:53.3457909Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:53.3484010Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:53.3515565Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:53.3546223Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:53.3570126Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:53.3597021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:53.3621492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:53.3645816Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.3671711Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.3695973Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:53.3722210Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:53.3754132Z Entering 'third_party/kleidiai' 2025-12-04T13:48:53.3780529Z Entering 'third_party/mimalloc' 2025-12-04T13:48:53.3808381Z Entering 'third_party/nlohmann' 2025-12-04T13:48:53.3834347Z Entering 'third_party/onnx' 2025-12-04T13:48:53.3866000Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:53.3903008Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:53.3934559Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:53.3959664Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:53.3989082Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:53.4023773Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:53.4051977Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:53.4078169Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:53.4103860Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:53.4133424Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.4170015Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.4203693Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:53.4245322Z Entering 'third_party/pocketfft' 2025-12-04T13:48:53.4272498Z Entering 'third_party/protobuf' 2025-12-04T13:48:53.4297276Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:48:53.4319173Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:53.4348175Z Entering 'third_party/psimd' 2025-12-04T13:48:53.4370922Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:53.4392922Z Entering 'third_party/pybind11' 2025-12-04T13:48:53.4416281Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:53.4442436Z Entering 'third_party/sleef' 2025-12-04T13:48:53.4467472Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:53.4497556Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:53.4531408Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:53.4559863Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:53.4585477Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:53.4608645Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:53.4662543Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:48:53.4685112Z http.https://github.com/.extraheader 2025-12-04T13:48:53.4694211Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T13:48:53.4715708Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 
'http.https://github.com/.extraheader' || :" 2025-12-04T13:48:53.4919654Z Entering 'android/libs/fbjni' 2025-12-04T13:48:53.4937715Z http.https://github.com/.extraheader 2025-12-04T13:48:53.4958771Z Entering 'third_party/FP16' 2025-12-04T13:48:53.4977036Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5000454Z Entering 'third_party/FXdiv' 2025-12-04T13:48:53.5016151Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5033853Z Entering 'third_party/NNPACK' 2025-12-04T13:48:53.5048379Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5069771Z Entering 'third_party/NVTX' 2025-12-04T13:48:53.5095427Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5120480Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:53.5147097Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5167384Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:53.5180866Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5217289Z Entering 'third_party/aiter' 2025-12-04T13:48:53.5236549Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5257126Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:53.5269108Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5293436Z Entering 'third_party/benchmark' 2025-12-04T13:48:53.5307108Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5335546Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:53.5350778Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5374106Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:53.5387417Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5408867Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:53.5422440Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5443086Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:53.5455969Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5475981Z Entering 'third_party/cutlass' 2025-12-04T13:48:53.5490183Z http.https://github.com/.extraheader 
2025-12-04T13:48:53.5520087Z Entering 'third_party/fbgemm' 2025-12-04T13:48:53.5535075Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5553763Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:53.5570083Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5589550Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:53.5608641Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5640347Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:53.5661454Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5680443Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:53.5695134Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5716857Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:53.5732915Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5758027Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:53.5780264Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5807203Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:53.5823660Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5845539Z Entering 'third_party/flash-attention' 2025-12-04T13:48:53.5861193Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5889610Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:53.5901568Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5923130Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:53.5945143Z http.https://github.com/.extraheader 2025-12-04T13:48:53.5973533Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:53.5988028Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6008650Z Entering 'third_party/fmt' 2025-12-04T13:48:53.6022607Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6043322Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:53.6062746Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6084203Z Entering 
'third_party/gloo' 2025-12-04T13:48:53.6102218Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6124419Z Entering 'third_party/googletest' 2025-12-04T13:48:53.6138743Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6156834Z Entering 'third_party/ideep' 2025-12-04T13:48:53.6179744Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6200621Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:53.6213621Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6243119Z Entering 'third_party/ittapi' 2025-12-04T13:48:53.6257793Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6276077Z Entering 'third_party/kineto' 2025-12-04T13:48:53.6293253Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6311481Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:53.6327254Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6345768Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:53.6367313Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6390632Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:53.6406926Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6429412Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:53.6444949Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6469237Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:53.6487687Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6505007Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:53.6519307Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6547081Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:53.6565412Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6584335Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:53.6598821Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6620476Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:53.6634734Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6654087Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:53.6668826Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6695047Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:53.6707963Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6727177Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.6748579Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6771420Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.6787448Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6810110Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:53.6823585Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6851268Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:53.6866480Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6886669Z Entering 'third_party/kleidiai' 2025-12-04T13:48:53.6901729Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6920890Z Entering 'third_party/mimalloc' 2025-12-04T13:48:53.6938280Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6954787Z Entering 'third_party/nlohmann' 2025-12-04T13:48:53.6972965Z http.https://github.com/.extraheader 2025-12-04T13:48:53.6991580Z Entering 'third_party/onnx' 2025-12-04T13:48:53.7007478Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7035029Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:53.7048684Z 
http.https://github.com/.extraheader 2025-12-04T13:48:53.7070827Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:53.7089931Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7109150Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:53.7131902Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7155689Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:53.7172065Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7190499Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:53.7209865Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7239516Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:53.7255025Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7279035Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:53.7296839Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7319745Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:53.7340439Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7366687Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:53.7385872Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7410298Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.7427674Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7447531Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.7465004Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7485680Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:53.7503766Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7535911Z Entering 'third_party/pocketfft' 2025-12-04T13:48:53.7554203Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7579265Z Entering 
'third_party/protobuf' 2025-12-04T13:48:53.7597884Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7626162Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:48:53.7642078Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7660908Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:53.7677352Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7699702Z Entering 'third_party/psimd' 2025-12-04T13:48:53.7715194Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7737562Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:53.7757736Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7778180Z Entering 'third_party/pybind11' 2025-12-04T13:48:53.7800812Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7820054Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:53.7841971Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7870848Z Entering 'third_party/sleef' 2025-12-04T13:48:53.7886193Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7909418Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:53.7930913Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7957168Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:53.7971791Z http.https://github.com/.extraheader 2025-12-04T13:48:53.7990863Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:53.8005489Z http.https://github.com/.extraheader 2025-12-04T13:48:53.8024388Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:53.8038506Z http.https://github.com/.extraheader 2025-12-04T13:48:53.8061254Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:53.8079193Z http.https://github.com/.extraheader 2025-12-04T13:48:53.8096469Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:53.8115973Z http.https://github.com/.extraheader 2025-12-04T13:48:53.8166020Z [command]/usr/bin/git config --local --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:53.8193333Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:48:53.8386063Z Entering 'android/libs/fbjni' 2025-12-04T13:48:53.8398234Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:48:53.8413664Z Entering 'third_party/FP16' 2025-12-04T13:48:53.8428165Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:48:53.8438066Z Entering 'third_party/FXdiv' 2025-12-04T13:48:53.8451071Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:48:53.8463088Z Entering 'third_party/NNPACK' 2025-12-04T13:48:53.8477238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:48:53.8488553Z Entering 'third_party/NVTX' 2025-12-04T13:48:53.8499347Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:48:53.8507854Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:53.8526303Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:48:53.8536077Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:53.8545680Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:48:53.8560935Z Entering 'third_party/aiter' 2025-12-04T13:48:53.8573078Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:48:53.8583487Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:53.8594491Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:48:53.8607669Z Entering 
'third_party/benchmark' 2025-12-04T13:48:53.8619878Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:53.8630263Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:53.8640629Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:48:53.8653403Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:53.8667182Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:48:53.8677225Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:53.8688466Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:48:53.8704549Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:53.8720410Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:48:53.8732633Z Entering 'third_party/cutlass' 2025-12-04T13:48:53.8746959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:48:53.8761461Z Entering 'third_party/fbgemm' 2025-12-04T13:48:53.8772931Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:48:53.8787175Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:53.8803922Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:48:53.8813117Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:53.8823065Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:48:53.8835691Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:53.8846091Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config 
remote.origin.url 2025-12-04T13:48:53.8858400Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:53.8868116Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:48:53.8885677Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:53.8897227Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:48:53.8906867Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:53.8918250Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:48:53.8926962Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:53.8935897Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:48:53.8948506Z Entering 'third_party/flash-attention' 2025-12-04T13:48:53.8960738Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:48:53.8968754Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:53.8987020Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:48:53.9000312Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:53.9010039Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:48:53.9023698Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:53.9033969Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:48:53.9044891Z Entering 'third_party/fmt' 2025-12-04T13:48:53.9059222Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config 
remote.origin.url 2025-12-04T13:48:53.9069036Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:53.9081209Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:48:53.9090873Z Entering 'third_party/gloo' 2025-12-04T13:48:53.9102314Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:48:53.9111574Z Entering 'third_party/googletest' 2025-12-04T13:48:53.9122338Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:53.9131216Z Entering 'third_party/ideep' 2025-12-04T13:48:53.9143998Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:48:53.9153122Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:53.9165191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:48:53.9178460Z Entering 'third_party/ittapi' 2025-12-04T13:48:53.9188796Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:48:53.9198807Z Entering 'third_party/kineto' 2025-12-04T13:48:53.9208900Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:48:53.9219021Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:53.9236533Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:48:53.9246292Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:53.9257754Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:48:53.9269165Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:53.9278792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:48:53.9287830Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:53.9303376Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:48:53.9312912Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:53.9331580Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:48:53.9345210Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:53.9360889Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:48:53.9379053Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:53.9392203Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:48:53.9407265Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:53.9419856Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:53.9430775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:53.9443488Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:48:53.9454393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:53.9464934Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:48:53.9474462Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:53.9485342Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:48:53.9495562Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.9505341Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:48:53.9515303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.9528292Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:48:53.9542783Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:53.9554559Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T13:48:53.9567324Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:53.9580343Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 
2025-12-04T13:48:53.9592899Z Entering 'third_party/kleidiai' 2025-12-04T13:48:53.9604875Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T13:48:53.9617642Z Entering 'third_party/mimalloc' 2025-12-04T13:48:53.9628618Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T13:48:53.9638052Z Entering 'third_party/nlohmann' 2025-12-04T13:48:53.9650071Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T13:48:53.9660302Z Entering 'third_party/onnx' 2025-12-04T13:48:53.9673779Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T13:48:53.9689701Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:53.9704076Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:53.9717257Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:53.9736191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T13:48:53.9750853Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:53.9764894Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:53.9777571Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:53.9793435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:53.9802780Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:53.9815167Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T13:48:53.9830751Z Entering 
'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:53.9849711Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T13:48:53.9866963Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:53.9883998Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T13:48:53.9894355Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:53.9913850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T13:48:53.9922618Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:53.9939859Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:48:53.9948487Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:53.9964085Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:48:53.9983121Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:53.9995020Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:48:54.0006904Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:54.0022835Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T13:48:54.0054484Z Entering 'third_party/pocketfft' 2025-12-04T13:48:54.0073161Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T13:48:54.0083635Z Entering 'third_party/protobuf' 2025-12-04T13:48:54.0102992Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T13:48:54.0112300Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:48:54.0123435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:54.0132305Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:54.0150079Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.0162674Z Entering 'third_party/psimd' 2025-12-04T13:48:54.0183415Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T13:48:54.0195622Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:54.0207309Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T13:48:54.0216903Z Entering 'third_party/pybind11' 2025-12-04T13:48:54.0227227Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:54.0237601Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:54.0247661Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T13:48:54.0257393Z Entering 'third_party/sleef' 2025-12-04T13:48:54.0267838Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T13:48:54.0276992Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:54.0290440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T13:48:54.0303114Z Entering 
'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:54.0312269Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.0322130Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:54.0334698Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T13:48:54.0344080Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:54.0355103Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T13:48:54.0365186Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:54.0380286Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:54.0396029Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:54.0411428Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T13:48:54.0444685Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0468956Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0486834Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0505151Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0526238Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0547520Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0562889Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0580434Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0595128Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0610599Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0627405Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0641662Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0656729Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0670403Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0684532Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0698458Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0711236Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0726416Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0745812Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0759254Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0776450Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0790552Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0804936Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0817534Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T13:48:54.0830289Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0847374Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0861556Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0875189Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0889309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0902770Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0917903Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0931604Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0946475Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0962570Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0982349Z [command]/usr/bin/git config 
--file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.0997600Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1011903Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1027501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1041011Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1056951Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1076480Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1093933Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1107494Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1122278Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1136513Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1150598Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1164638Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1178617Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1193140Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1207019Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1226468Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1240222Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1255823Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1270351Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1283688Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1297746Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1313188Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1327405Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1343859Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1361206Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T13:48:54.1377503Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1401651Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1431000Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1443586Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1457957Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1475569Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1490364Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1505237Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1520243Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T13:48:54.1534633Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1551373Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1566052Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1579485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1594463Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1610485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1625571Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1644304Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1659458Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1683725Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp 
^includeIf\.gitdir: 2025-12-04T13:48:54.1699684Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1717200Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.1815078Z Post job cleanup. 2025-12-04T13:48:54.2269040Z [command]/usr/bin/git version 2025-12-04T13:48:54.2289881Z git version 2.52.0 2025-12-04T13:48:54.2305720Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/a373b02b-dcb2-405d-91aa-27d27b62578d/.gitconfig' 2025-12-04T13:48:54.2310796Z Temporarily overriding HOME='/home/runner/_work/_temp/a373b02b-dcb2-405d-91aa-27d27b62578d' before making global git config changes 2025-12-04T13:48:54.2311132Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:48:54.2313080Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:48:54.2339290Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:48:54.2357277Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:48:54.2535594Z Entering 'android/libs/fbjni' 2025-12-04T13:48:54.2564429Z Entering 'third_party/FP16' 2025-12-04T13:48:54.2592014Z Entering 'third_party/FXdiv' 2025-12-04T13:48:54.2621430Z Entering 'third_party/NNPACK' 2025-12-04T13:48:54.2649101Z Entering 'third_party/NVTX' 2025-12-04T13:48:54.2672341Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:54.2694864Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:54.2730901Z Entering 'third_party/aiter' 2025-12-04T13:48:54.2761512Z 
Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:54.2792318Z Entering 'third_party/benchmark' 2025-12-04T13:48:54.2817609Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:54.2849669Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:54.2872486Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:54.2897782Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:54.2930440Z Entering 'third_party/cutlass' 2025-12-04T13:48:54.2962042Z Entering 'third_party/fbgemm' 2025-12-04T13:48:54.2992729Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:54.3015973Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:54.3048295Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:54.3072678Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:54.3103224Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:54.3127952Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:54.3155096Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:54.3186693Z Entering 'third_party/flash-attention' 2025-12-04T13:48:54.3210016Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:54.3242390Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:54.3270948Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:54.3295399Z Entering 'third_party/fmt' 2025-12-04T13:48:54.3318950Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:54.3341002Z Entering 'third_party/gloo' 2025-12-04T13:48:54.3365490Z Entering 'third_party/googletest' 2025-12-04T13:48:54.3388890Z Entering 'third_party/ideep' 2025-12-04T13:48:54.3410128Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:54.3441038Z Entering 'third_party/ittapi' 2025-12-04T13:48:54.3473299Z Entering 'third_party/kineto' 2025-12-04T13:48:54.3501074Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:54.3521368Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:54.3555286Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:54.3579678Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:54.3611079Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:54.3639365Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:54.3663558Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:54.3689107Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:54.3716299Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:54.3740885Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:54.3768777Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:54.3795916Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.3824167Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.3862430Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:54.3890579Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:54.3915575Z Entering 'third_party/kleidiai' 2025-12-04T13:48:54.3938452Z Entering 'third_party/mimalloc' 2025-12-04T13:48:54.3965057Z Entering 'third_party/nlohmann' 2025-12-04T13:48:54.3990018Z Entering 'third_party/onnx' 2025-12-04T13:48:54.4020260Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:54.4047151Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:54.4073278Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:54.4099829Z 
Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:54.4122524Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:54.4147149Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:54.4172864Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:54.4195009Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:54.4217886Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:54.4239015Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.4271152Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.4297275Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:54.4333307Z Entering 'third_party/pocketfft' 2025-12-04T13:48:54.4355971Z Entering 'third_party/protobuf' 2025-12-04T13:48:54.4383057Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:48:54.4412010Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:54.4437242Z Entering 'third_party/psimd' 2025-12-04T13:48:54.4464708Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:54.4488024Z Entering 'third_party/pybind11' 2025-12-04T13:48:54.4510989Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:54.4539032Z Entering 'third_party/sleef' 2025-12-04T13:48:54.4562303Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:54.4589566Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:54.4614584Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:54.4637077Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:54.4662745Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:54.4687709Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:54.4737500Z 
[command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:48:54.4762448Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T13:48:54.4977418Z Entering 'android/libs/fbjni' 2025-12-04T13:48:54.5002238Z Entering 'third_party/FP16' 2025-12-04T13:48:54.5028258Z Entering 'third_party/FXdiv' 2025-12-04T13:48:54.5056319Z Entering 'third_party/NNPACK' 2025-12-04T13:48:54.5083802Z Entering 'third_party/NVTX' 2025-12-04T13:48:54.5119457Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:54.5141946Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:54.5176189Z Entering 'third_party/aiter' 2025-12-04T13:48:54.5199641Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:54.5227657Z Entering 'third_party/benchmark' 2025-12-04T13:48:54.5248689Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:54.5280952Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:54.5308897Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:54.5335634Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:54.5359085Z Entering 'third_party/cutlass' 2025-12-04T13:48:54.5389272Z Entering 'third_party/fbgemm' 2025-12-04T13:48:54.5417137Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:54.5443757Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:54.5468989Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:54.5502807Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:54.5542606Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:54.5571482Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:54.5598737Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:54.5632280Z Entering 'third_party/flash-attention' 
2025-12-04T13:48:54.5677081Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:54.5709832Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:54.5741456Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:54.5766222Z Entering 'third_party/fmt' 2025-12-04T13:48:54.5792267Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:54.5818491Z Entering 'third_party/gloo' 2025-12-04T13:48:54.5847377Z Entering 'third_party/googletest' 2025-12-04T13:48:54.5871385Z Entering 'third_party/ideep' 2025-12-04T13:48:54.5894079Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:54.5921442Z Entering 'third_party/ittapi' 2025-12-04T13:48:54.5949067Z Entering 'third_party/kineto' 2025-12-04T13:48:54.5974155Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:54.6018228Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:54.6055391Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:54.6088089Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:54.6114814Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:54.6142162Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:54.6176951Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:54.6200520Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:54.6220339Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:54.6247554Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:54.6272760Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:54.6295673Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.6327617Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.6356735Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:54.6379683Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:54.6408245Z Entering 'third_party/kleidiai' 2025-12-04T13:48:54.6432241Z Entering 'third_party/mimalloc' 2025-12-04T13:48:54.6460116Z Entering 'third_party/nlohmann' 2025-12-04T13:48:54.6484995Z Entering 'third_party/onnx' 2025-12-04T13:48:54.6519732Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:54.6547813Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:54.6577541Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:54.6601728Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:54.6627468Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:54.6652009Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:54.6676784Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:54.6699013Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:54.6721977Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:54.6743487Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.6773532Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.6800992Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:54.6831281Z Entering 'third_party/pocketfft' 2025-12-04T13:48:54.6853363Z Entering 'third_party/protobuf' 2025-12-04T13:48:54.6885621Z Entering 'third_party/protobuf/third_party/benchmark' 
2025-12-04T13:48:54.6909107Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:54.6932670Z Entering 'third_party/psimd' 2025-12-04T13:48:54.6956336Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:54.6980461Z Entering 'third_party/pybind11' 2025-12-04T13:48:54.7011393Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:54.7036961Z Entering 'third_party/sleef' 2025-12-04T13:48:54.7060156Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:54.7085075Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:54.7106235Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:54.7129390Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:54.7156134Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:54.7178584Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:54.7221439Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.7244739Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:48:54.7422572Z Entering 'android/libs/fbjni' 2025-12-04T13:48:54.7432801Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:48:54.7441497Z Entering 'third_party/FP16' 2025-12-04T13:48:54.7453883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:48:54.7463860Z Entering 'third_party/FXdiv' 2025-12-04T13:48:54.7473906Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:48:54.7486929Z Entering 'third_party/NNPACK' 2025-12-04T13:48:54.7497327Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:48:54.7506552Z Entering 'third_party/NVTX' 2025-12-04T13:48:54.7516586Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:48:54.7526047Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:48:54.7536713Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:48:54.7546033Z Entering 'third_party/XNNPACK' 2025-12-04T13:48:54.7555498Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:48:54.7569192Z Entering 'third_party/aiter' 2025-12-04T13:48:54.7579231Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:48:54.7588305Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:48:54.7598504Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:48:54.7616536Z Entering 'third_party/benchmark' 2025-12-04T13:48:54.7626464Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:54.7635178Z Entering 'third_party/composable_kernel' 2025-12-04T13:48:54.7645198Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:48:54.7657485Z Entering 'third_party/cpp-httplib' 2025-12-04T13:48:54.7667141Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:48:54.7676349Z Entering 'third_party/cpuinfo' 2025-12-04T13:48:54.7687472Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:48:54.7695878Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:48:54.7706727Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:48:54.7715930Z Entering 'third_party/cutlass' 2025-12-04T13:48:54.7726838Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:48:54.7739370Z Entering 'third_party/fbgemm' 2025-12-04T13:48:54.7752864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:48:54.7767032Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:48:54.7776371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:48:54.7784521Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:48:54.7794530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:48:54.7815211Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:48:54.7826880Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T13:48:54.7837196Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:48:54.7849143Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:48:54.7862389Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:48:54.7883888Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:48:54.7893346Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:48:54.7913302Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:48:54.7922700Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:48:54.7943458Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:48:54.7956527Z Entering 'third_party/flash-attention' 2025-12-04T13:48:54.7966450Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:48:54.7975857Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:48:54.7990418Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:48:54.8002647Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:48:54.8017662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:48:54.8033512Z Entering 'third_party/flatbuffers' 2025-12-04T13:48:54.8051929Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:48:54.8062678Z Entering 'third_party/fmt' 2025-12-04T13:48:54.8072240Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:48:54.8085015Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:48:54.8097517Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:48:54.8110948Z Entering 'third_party/gloo' 2025-12-04T13:48:54.8120422Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:48:54.8129022Z Entering 'third_party/googletest' 2025-12-04T13:48:54.8140078Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.8149635Z Entering 'third_party/ideep' 2025-12-04T13:48:54.8159891Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:48:54.8168619Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:48:54.8182982Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:48:54.8200178Z Entering 
'third_party/ittapi' 2025-12-04T13:48:54.8210384Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:48:54.8223092Z Entering 'third_party/kineto' 2025-12-04T13:48:54.8232895Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:48:54.8243611Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:48:54.8261603Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:48:54.8275261Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:48:54.8287009Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:48:54.8297524Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:48:54.8313959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:48:54.8323258Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:48:54.8333152Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:48:54.8342023Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:48:54.8352914Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:48:54.8361675Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:48:54.8372067Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:48:54.8389013Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:48:54.8408022Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:48:54.8417871Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:48:54.8427959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.8437282Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:48:54.8449135Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:48:54.8459595Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:48:54.8471770Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:48:54.8481351Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:48:54.8491477Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:48:54.8499692Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.8509776Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:48:54.8522774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.8532432Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:48:54.8546262Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:48:54.8556647Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T13:48:54.8568316Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:48:54.8578918Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.8594109Z Entering 'third_party/kleidiai' 2025-12-04T13:48:54.8607464Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T13:48:54.8617099Z Entering 'third_party/mimalloc' 2025-12-04T13:48:54.8629602Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T13:48:54.8638981Z Entering 'third_party/nlohmann' 2025-12-04T13:48:54.8648331Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T13:48:54.8657197Z Entering 'third_party/onnx' 2025-12-04T13:48:54.8666462Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T13:48:54.8680831Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:48:54.8692023Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:54.8709049Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:48:54.8719068Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T13:48:54.8727596Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:48:54.8736398Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:54.8746194Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:48:54.8755354Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.8763748Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:48:54.8775120Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T13:48:54.8784873Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:48:54.8795308Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T13:48:54.8806758Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:48:54.8816426Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T13:48:54.8829247Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:48:54.8840630Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T13:48:54.8850303Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:48:54.8865665Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:48:54.8874766Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:48:54.8894968Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:48:54.8906239Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:48:54.8919233Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:48:54.8931616Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:48:54.8946945Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T13:48:54.8965430Z Entering 'third_party/pocketfft' 2025-12-04T13:48:54.8977357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T13:48:54.8986519Z Entering 'third_party/protobuf' 2025-12-04T13:48:54.8997079Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T13:48:54.9007582Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:48:54.9018824Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:48:54.9027727Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:48:54.9048036Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.9059863Z Entering 
'third_party/psimd' 2025-12-04T13:48:54.9071620Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T13:48:54.9081708Z Entering 'third_party/pthreadpool' 2025-12-04T13:48:54.9091446Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T13:48:54.9101302Z Entering 'third_party/pybind11' 2025-12-04T13:48:54.9110986Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:54.9119826Z Entering 'third_party/python-peachpy' 2025-12-04T13:48:54.9129165Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T13:48:54.9137617Z Entering 'third_party/sleef' 2025-12-04T13:48:54.9146807Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T13:48:54.9155378Z Entering 'third_party/tensorpipe' 2025-12-04T13:48:54.9165142Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T13:48:54.9175253Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:48:54.9193085Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:48:54.9202642Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:48:54.9214145Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T13:48:54.9223267Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:48:54.9234851Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T13:48:54.9246548Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:48:54.9258277Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:48:54.9267600Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:48:54.9277431Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T13:48:54.9305903Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9326901Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9345187Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9361251Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9378800Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9398506Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9416168Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9434954Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9452448Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9468491Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9484451Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9501423Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9521492Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9540334Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9558915Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9575082Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9590661Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9606053Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9620937Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9636932Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9652717Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9673943Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9689755Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9707390Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9723892Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9742845Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9764537Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9780816Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T13:48:54.9798487Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9815438Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9833059Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9853235Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9877719Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9900261Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9921246Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9937590Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9954318Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9975759Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:54.9991604Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0015420Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0039114Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0060169Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0084272Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0105064Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0127852Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0143773Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0158712Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0174342Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0191460Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0206449Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0224062Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0240306Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0261172Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0277261Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0297083Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T13:48:55.0313151Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0330413Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0347531Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0363255Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0379132Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0396619Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0418453Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0435335Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0452414Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0475427Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0492751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0509043Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0526542Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0543251Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0559476Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0576370Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0597209Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0614749Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0632199Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0649143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0666186Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0682133Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0700523Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0717796Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0737421Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0754151Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:48:55.0880652Z Cleaning up orphan processes